## Fundamentals

### Pandas Library

Pandas is an open-source data analysis and manipulation library for Python. It provides the data structures and functions needed to work with structured (tabular, labeled) data. Its primary data structures are `Series` (one-dimensional) and `DataFrame` (two-dimensional).

#### Key Features:
- **DataFrame Object**: A two-dimensional, size-mutable, and potentially heterogeneous tabular data structure with labeled axes (rows and columns).
- **Data Alignment**: Automatic and explicit data alignment, making it easy to work with incomplete or missing data.
- **Data Loading and Cleaning**: Tools for reading data from different file formats (CSV, Excel, SQL, etc.) and for detecting, dropping, or filling missing values.
- **Data Transformation**: Functions for merging, reshaping, selecting, and manipulating data.
- **Data Aggregation**: Group by functionality for performing split-apply-combine operations on data sets.
- **Time Series**: Powerful tools for working with time series data, including date range generation and frequency conversion.
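
To make the data alignment feature concrete, here is a minimal sketch. The two Series and their labels (`x`, `y`, `z`, `w`) are made up for illustration; the point is that arithmetic matches values by index label rather than by position.

```python
import pandas as pd

# Two Series whose indexes only partly overlap (labels are illustrative)
a = pd.Series([1, 2, 3], index=['x', 'y', 'z'])
b = pd.Series([10, 20, 30], index=['y', 'z', 'w'])

# Addition pairs values by label; labels found in only one Series give NaN
total = a + b
print(total)

# fill_value=0 treats a label missing from one side as 0 instead of NaN
filled = a.add(b, fill_value=0)
print(filled)
```

Only `y` and `z` appear in both inputs, so only those two labels get a numeric sum in `total`; `filled` keeps all four labels.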

#### Usage:
1. **Importing Pandas**:
    ```python
    import pandas as pd
    ```

2. **Creating a DataFrame**:
    ```python
    data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35], 'Gender': ['F', 'M', 'M']}
    df = pd.DataFrame(data)
    ```

3. **Reading Data from a CSV File**:
    ```python
    df = pd.read_csv('path_to_file.csv')
    ```

4. **Displaying the First Few Rows**:
    ```python
    print(df.head())
    ```

5. **Basic DataFrame Operations**:
    - **Selecting Columns**:
        ```python
        df['Name']
        ```
    - **Filtering Rows**:
        ```python
        df[df['Age'] > 30]
        ```
    - **Adding a New Column**:
        ```python
        df['AgePlusOne'] = df['Age'] + 1
        ```

6. **Group By and Aggregation**:
    ```python
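    # Select the numeric column first; averaging a text column such as 'Name' raises a TypeError in pandas 2.x
    df.groupby('Gender')['Age'].mean()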
    ```
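
The steps above can be combined into one short script. This is a sketch that reuses the three-row example data from step 2; the variable names `people`, `older`, and `mean_age` are chosen here for illustration.

```python
import pandas as pd

# Same example data as in step 2
data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35], 'Gender': ['F', 'M', 'M']}
people = pd.DataFrame(data)

# Filter rows with a boolean mask
older = people[people['Age'] > 30]

# Add a derived column
people['AgePlusOne'] = people['Age'] + 1

# Split by Gender, then average the numeric Age column within each group
mean_age = people.groupby('Gender')['Age'].mean()

print(older)
print(mean_age)
```

Selecting `'Age'` before calling `mean()` keeps the aggregation on a numeric column, so the text column `Name` is never averaged.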

Pandas is an essential tool for data scientists and analysts, providing robust functionality for data manipulation and analysis.

### Import Library

In [5]:
import pandas as pd

### Data Loading

In [6]:
# Read a CSV file into a DataFrame
df = pd.read_csv('04_pandas/sample.csv')

# Display the first few rows of the DataFrame
print(df.head())

   PatientID FirstName LastName DateOfBirth  Gender ContactNumber  \
0          1      John      Doe  1985-05-15    Male      555-1234   
1          2      Jane    Smith  1990-08-22  Female      555-5678   
2          3     Emily    Jones  1975-11-30  Female      555-8765   
3          4   Michael    Brown  1982-02-14    Male      555-4321   
4          5   Jessica  Johnson  2000-07-07  Female      555-6789   

                         Email       Address     City State  ZipCode  \
0         john.doe@example.com   123 Main St  Anytown    CA    12345   
1       jane.smith@example.com    456 Oak St  Anytown    CA    12345   
2      emily.jones@example.com   789 Pine St  Anytown    CA    12345   
3    michael.brown@example.com  321 Maple St  Anytown    CA    12345   
4  jessica.johnson@example.com  654 Cedar St  Anytown    CA    12345   

  DateOfAdmission     Diagnosis  
0      2023-01-10           Flu  
1      2023-02-15    Broken Arm  
2      2023-03-20      Diabetes  
3      2023-04-2
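
If the sample file is not at hand, `read_csv` can be tried on an in-memory string instead. The sketch below assumes a three-column subset of the table above (the first two rows only) and uses `parse_dates` so the admission date is loaded as a datetime rather than as text; the name `patients` is illustrative.

```python
import io
import pandas as pd

# A small CSV held in memory, using two rows from the output above
csv_text = """PatientID,FirstName,DateOfAdmission
1,John,2023-01-10
2,Jane,2023-02-15
"""

# read_csv accepts any file-like object, not only a path
patients = pd.read_csv(io.StringIO(csv_text), parse_dates=['DateOfAdmission'])

print(patients.shape)
print(patients.dtypes)
```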

### Series

A `Series` is a one-dimensional labeled array in pandas: each value is paired with an index label.

In [7]:
# Create a series from a list
data = [1, 2, 3, 4, 5]
s = pd.Series(data)
print(s)

0    1
1    2
2    3
3    4
4    5
dtype: int64
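
The default index is just the positions 0 to 4, but a Series can also carry its own labels. The following sketch uses hypothetical names and values (`temps`, `counts`, the weekday labels) to show label-based and position-based access side by side.

```python
import pandas as pd

# Explicit index labels and a name (all values here are illustrative)
temps = pd.Series([20.5, 22.1, 19.8], index=['Mon', 'Tue', 'Wed'], name='temp_c')

print(temps['Tue'])    # lookup by label
print(temps.iloc[0])   # lookup by position

# Building from a dict: the keys become the index
counts = pd.Series({'a': 3, 'b': 7})
print(counts)
```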


### Series Attributes and Methods

A pandas `Series` can hold any data type (integers, floats, strings, Python objects, and so on). The attributes and methods listed here cover the most common ways to inspect and transform one.

#### Attributes:
1. **`index`**: The index (axis labels) of the Series.
    ```python
    s.index
    ```

2. **`values`**: The values of the Series, returned as a NumPy array (or an array-like object for extension types).
    ```python
    s.values
    ```

3. **`dtype`**: The data type of the Series.
    ```python
    s.dtype
    ```

4. **`name`**: The name of the Series.
    ```python
    s.name
    ```

5. **`size`**: The number of elements in the Series.
    ```python
    s.size
    ```

6. **`shape`**: The shape of the Series, as a one-element tuple `(n,)`.
    ```python
    s.shape
    ```
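
As a quick check of the attributes above, the sketch below rebuilds the five-element Series from the earlier cell and prints each attribute. The name `'numbers'` is an assumption added here so that `name` has something to show; without it, `s.name` is `None`.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5], name='numbers')

print(s.index)    # RangeIndex(start=0, stop=5, step=1)
print(s.values)   # the underlying array of values
print(s.dtype)    # integer dtype inferred from the list
print(s.name)     # numbers
print(s.size)     # 5
print(s.shape)    # (5,)
```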

#### Methods:
1. **`head(n)`**: Returns the first `n` elements of the Series.
    ```python
    s.head(3)
    ```

2. **`tail(n)`**: Returns the last `n` elements of the Series.
    ```python
    s.tail(3)
    ```

3. **`describe()`**: Generates descriptive statistics of the Series (for numeric data: count, mean, standard deviation, minimum, quartiles, and maximum).
    ```python
    s.describe()
    ```

4. **`mean()`**: Returns the mean of the Series.
    ```python
    s.mean()
    ```

5. **`sum()`**: Returns the sum of the Series.
    ```python
    s.sum()
    ```

6. **`unique()`**: Returns the unique values in the Series.
    ```python
    s.unique()
    ```

7. **`value_counts()`**: Returns a Series containing counts of unique values.
    ```python
    s.value_counts()
    ```

8. **`apply(func)`**: Applies a function to each element in the Series.
    ```python
    s.apply(lambda x: x * 2)
    ```

9. **`map(func)`**: Maps each element to a new value. Besides a function, `map` also accepts a dict or a Series as a lookup table.
    ```python
    s.map(lambda x: x * 2)
    ```

10. **`sort_values()`**: Sorts the Series by its values.
    ```python
    s.sort_values()
    ```

11. **`sort_index()`**: Sorts the Series by its index.
    ```python
    s.sort_index()
    ```

12. **`dropna()`**: Returns a Series with missing values removed.
    ```python
    s.dropna()
    ```

13. **`fillna(value)`**: Fills missing values with the specified value.
    ```python
    s.fillna(0)
    ```

14. **`astype(dtype)`**: Casts the Series to the specified data type.
    ```python
    s.astype(float)
    ```
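
Several of these methods are easier to follow on a Series that actually contains a missing value. The sketch below uses a made-up `score` Series; the variable names and the label mapping are illustrative only.

```python
import numpy as np
import pandas as pd

# Illustrative data with one missing value
raw = pd.Series([3, 1, np.nan, 3, 2], name='score')

cleaned = raw.dropna()            # drops the NaN entry
filled = raw.fillna(0)            # replaces NaN with 0
as_int = filled.astype(int)       # safe to cast now that no NaN remains

counts = as_int.value_counts()    # how often each value occurs
uniq = as_int.unique()            # distinct values in order of first appearance
ordered = as_int.sort_values()    # ascending by value

# map with a dict acts as a lookup table; apply calls a function per element
labels = as_int.map({0: 'none', 1: 'low', 2: 'mid', 3: 'high'})
doubled = as_int.apply(lambda x: x * 2)

# mean() and sum() skip NaN by default
print(raw.mean(), raw.sum())
print(labels.tolist())
```

Filling or dropping missing values before `astype(int)` matters because NaN cannot be stored in a plain integer column.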

These attributes and methods make it easy to manipulate and analyze data in a Pandas Series.