# OOP 4: Docstrings in Python Classes

In this notebook, we'll explore how to use docstrings in Python classes to document your code effectively. We'll look at how to document classes, methods, and attributes with docstrings, following best practices.

## Table of Contents
1. [Introduction to Docstrings](#1)
2. [Class Docstrings](#2)
3. [Method Docstrings](#3)
4. [Attribute Docstrings](#4)
5. [Best Practices for Writing Docstrings](#5)
6. [Exercise: Documenting a Data Analysis Class](#6)

---
## 1. Introduction to Docstrings <a id="1"></a>

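Docstrings are string literals that appear as the first statement in a module, class, method, or function. Python stores them on the object's `__doc__` attribute, so tools such as `help()` and IDEs can display them. They help other developers understand the purpose and usage of your code.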

**Example**

In [2]:
def add(a, b):
    """
    Adds two numbers.

    Parameters:
    a (int, float): The first number.
    b (int, float): The second number.

    Returns:
    int, float: The sum of the two numbers.
    """
    return a + b

In [None]:
import pandas as pd

# Library classes carry docstrings too; in Jupyter, `pd.DataFrame?` displays this one
pd.DataFrame
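Because Python stores a docstring on the object's `__doc__` attribute, it can be read back at runtime. The sketch below reuses the `add` function from above and cleans up the indentation with the standard-library `inspect.getdoc`; the variable names `raw_doc` and `clean_doc` are only illustrative.

```python
import inspect


def add(a, b):
    """
    Adds two numbers.

    Parameters:
    a (int, float): The first number.
    b (int, float): The second number.

    Returns:
    int, float: The sum of the two numbers.
    """
    return a + b


# The raw docstring keeps the leading newline and the indentation.
raw_doc = add.__doc__

# inspect.getdoc() returns the same text with the indentation cleaned up.
clean_doc = inspect.getdoc(add)

print(clean_doc)
```

In an interactive session, `help(add)` displays the same text.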

---
## 2. Class Docstrings <a id="2"></a>

Class docstrings should provide a brief overview of the class and its purpose, and list its public attributes along with any other important details.

**Example**

In [None]:
class DataScientist:
    """
    Represents a data scientist with a name and expertise level.

    Attributes:
    name (str): The name of the data scientist.
    expertise_level (str): The expertise level of the data scientist.
    """

    def __init__(self, name, expertise_level):
        self.name = name
        self.expertise_level = expertise_level
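As a quick check, the class docstring can be read from the class itself or from any of its instances. This is a minimal sketch; the instance values (`"Ada"`, `"senior"`) are made up for illustration.

```python
import inspect


class DataScientist:
    """
    Represents a data scientist with a name and expertise level.

    Attributes:
    name (str): The name of the data scientist.
    expertise_level (str): The expertise level of the data scientist.
    """

    def __init__(self, name, expertise_level):
        self.name = name
        self.expertise_level = expertise_level


ada = DataScientist("Ada", "senior")

# Instances expose the docstring of their class.
same_doc = ada.__doc__ == DataScientist.__doc__

# The first line works as a one-sentence summary of the class.
summary_line = inspect.getdoc(DataScientist).splitlines()[0]
print(summary_line)
```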

---
## 3. Method Docstrings <a id="3"></a>

Method docstrings should describe what the method does, its parameters, and its return value.

**Example**

In [None]:
class DataScientist:
    """
    Represents a data scientist with a name and expertise level.

    Attributes:
    name (str): The name of the data scientist.
    expertise_level (str): The expertise level of the data scientist.
    """

    def __init__(self, name, expertise_level):
        """
        Initializes a DataScientist object with a name and expertise level.

        Parameters:
        name (str): The name of the data scientist.
        expertise_level (str): The expertise level of the data scientist.
        """
        self.name = name
        self.expertise_level = expertise_level
    
    def analyze_data(self, data):
        """
        Analyzes the provided data.

        Parameters:
        data (list): A list of numerical data.

        Returns:
        dict: A dictionary containing the mean and median of the data.
        """
        import statistics
        return {
            'mean': statistics.mean(data),
            'median': statistics.median(data)
        }
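A method docstring can also carry a short usage example. The sketch below adds an `Example:` section to `analyze_data` and runs it with the standard-library `doctest` module, so the documented output is checked against the real return value. The `Example:` section and the sample values are additions for illustration, not part of the class above.

```python
import doctest
import statistics


class DataScientist:
    """Represents a data scientist with a name and expertise level."""

    def __init__(self, name, expertise_level):
        self.name = name
        self.expertise_level = expertise_level

    def analyze_data(self, data):
        """
        Analyzes the provided data.

        Parameters:
        data (list): A list of numerical data.

        Returns:
        dict: A dictionary containing the mean and median of the data.

        Example:
        >>> DataScientist("Ada", "senior").analyze_data([1, 2, 3, 4])
        {'mean': 2.5, 'median': 2.5}
        """
        return {
            'mean': statistics.mean(data),
            'median': statistics.median(data)
        }


# Parse the example out of the docstring and run it.
parser = doctest.DocTestParser()
test = parser.get_doctest(
    DataScientist.analyze_data.__doc__,  # text containing the >>> example
    {"DataScientist": DataScientist},    # names the example may use
    "analyze_data",                      # label used in failure reports
    None,                                # no source file
    0,                                   # line offset
)
results = doctest.DocTestRunner(verbose=False).run(test)
print(results)
```

If the documented output ever drifts from what the method returns, `results.failed` becomes non-zero and the runner prints the mismatch.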

---
## 4. Attribute Docstrings <a id="4"></a>

Python has no separate runtime docstring for instance attributes, so the usual approach is to document them in an `Attributes` section of the class docstring.

**Example**

In [None]:
class DataScientist:
    """
    Represents a data scientist with a name and expertise level.

    Attributes:
    name (str): The name of the data scientist.
    expertise_level (str): The expertise level of the data scientist.
    """
    
    def __init__(self, name, expertise_level):
        """
        Initializes a DataScientist object with a name and expertise level.

        Parameters:
        name (str): The name of the data scientist.
        expertise_level (str): The expertise level of the data scientist.
        """
        self.name = name
        self.expertise_level = expertise_level
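Plain instance attributes cannot carry a docstring of their own at runtime. One alternative, sketched here as an assumption rather than as this notebook's approach, is to expose an attribute as a property, because a property takes the docstring of its getter. The private name `_expertise_level` is hypothetical.

```python
class DataScientist:
    """
    Represents a data scientist with a name and expertise level.

    Attributes:
    name (str): The name of the data scientist.
    """

    def __init__(self, name, expertise_level):
        self.name = name
        # Stored under a private name (an assumption of this sketch).
        self._expertise_level = expertise_level

    @property
    def expertise_level(self):
        """str: The expertise level of the data scientist."""
        return self._expertise_level


ada = DataScientist("Ada", "senior")

# The property object on the class carries the getter's docstring.
print(DataScientist.expertise_level.__doc__)

# A plain instance attribute only exposes the docstring of its type (str).
print(ada.name.__doc__ == str.__doc__)
```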

---
## 5. Best Practices for Writing Docstrings <a id="5"></a>
- **Clarity**: Write docstrings that are easy to understand.

- **Grammar**: Write complete sentences with correct spelling and punctuation.

- **Consistency**: Use the same format for all docstrings in your project.

- **Details**: Document parameters, return values, and any important details about the class or method.

- **Quotes**: Use triple quotes (""") for writing docstrings in Python.

Popular docstring formats, such as Google style and NumPy style, are described [here](https://joshdimella.com/blog/python-docstring-formats-best-practices).
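To make the format names concrete, here is the `add` function from section 1 written once in Google style and once in NumPy style. The function names `add_google` and `add_numpy` are hypothetical, chosen only so that both versions fit in one cell.

```python
def add_google(a, b):
    """Adds two numbers.

    Args:
        a (int | float): The first number.
        b (int | float): The second number.

    Returns:
        int | float: The sum of the two numbers.
    """
    return a + b


def add_numpy(a, b):
    """
    Adds two numbers.

    Parameters
    ----------
    a : int or float
        The first number.
    b : int or float
        The second number.

    Returns
    -------
    int or float
        The sum of the two numbers.
    """
    return a + b


# Both functions behave identically; only the docstring layout differs.
print(add_google(1, 2), add_numpy(1, 2))
```

Whichever format you pick, use it for every docstring in the project.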

---
## 6. Exercise: Documenting a Data Analysis Class <a id="6"></a>

Create and document a `DataAnalysis` class that performs basic data analysis operations. The class should have the following methods:

- `__init__(self, data)`: Initializes the `DataAnalysis` object with a dictionary of data.
- `summary(self)`: Returns a summary of the data, including the number of rows and columns.
- `mean(self, column)`: Returns the mean of a specified column.
- `median(self, column)`: Returns the median of a specified column.

In [None]:
# your class definition here

><details>
><summary>Do you need some help?</summary>
> 
> Here is a working solution:
>
>```python
>import pandas as pd
>import statistics
>
>class DataAnalysis:
>    """
>    Performs basic data analysis operations on a given dataset.
>
>    Attributes:
>    data (dict): A dictionary of data where keys are column names and values are lists of data.
>    _df (DataFrame): A pandas DataFrame created from the data dictionary.
>    """
>
>    def __init__(self, data):
>        """
>        Initializes the DataAnalysis object with a dictionary of data.
>
>        Parameters:
>        data (dict): A dictionary of data where keys are column names and values are lists of data.
>        """
>        self.data = data
>        self._df = pd.DataFrame(data)
>    
>    def summary(self):
>        """
>        Returns a summary of the data, including the number of rows and columns.
>
>        Returns:
>        str: A string summarizing the number of rows and columns in the dataset.
>        """
>        return f"Dataset with {len(self._df)} rows and {len(self._df.columns)} columns"
>    
>    def mean(self, column):
>        """
>        Returns the mean of a specified column.
>
>        Parameters:
>        column (str): The column for which to calculate the mean.
>
>        Returns:
>        float: The mean of the specified column.
>        """
>        return self._df[column].mean()
>    
>    def median(self, column):
>        """
>        Returns the median of a specified column.
>
>        Parameters:
>        column (str): The column for which to calculate the median.
>
>        Returns:
>        float: The median of the specified column.
>        """
>        return statistics.median(self._df[column])
> ```
> </details>

Now check whether your code works as expected by running the following cell:

In [None]:
data = {
    'age': [25, 30, 35, 40, 45],
    'salary': [50000, 60000, 70000, 80000, 90000]
}
analysis = DataAnalysis(data)
print(analysis.summary())
print(analysis.mean('age'))
print(analysis.median('salary'))
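To check that no method was left undocumented, a small helper can list the functions on a class that have no docstring. This is a minimal sketch: `missing_docstrings` and the `Example` class are hypothetical names introduced here, not part of the exercise.

```python
import inspect


def missing_docstrings(cls):
    """Return the names of functions defined on cls that lack a docstring."""
    return [
        name
        for name, func in inspect.getmembers(cls, inspect.isfunction)
        if not func.__doc__
    ]


class Example:
    """A tiny class used only to demonstrate the helper."""

    def documented(self):
        """Does nothing; it only carries a docstring."""

    def undocumented(self):
        pass


print(missing_docstrings(Example))
```

Calling `missing_docstrings(DataAnalysis)` on your own solution should return an empty list once every method has a docstring.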