# Q1. Create a Pandas Series that contains the following data: 4, 8, 15, 16, 23, and 42. Then, print the series.

In [1]:
import pandas as pd

# Creating the Pandas Series
data = [4, 8, 15, 16, 23, 42]
series = pd.Series(data)

# Printing the series
series


0     4
1     8
2    15
3    16
4    23
5    42
dtype: int64

# Q2. Create a variable of list type containing 10 elements, apply the pandas.Series function to it, and print the result.

In [2]:
# Creating a list with 10 elements
data_list = [3, 7, 12, 19, 25, 31, 38, 44, 50, 57]

# Converting the list to a Pandas Series
series_from_list = pd.Series(data_list)

# Printing the series
series_from_list


0     3
1     7
2    12
3    19
4    25
5    31
6    38
7    44
8    50
9    57
dtype: int64

# Q3. Create a Pandas DataFrame that contains the following data:

| Name   | Age | Gender |
|--------|-----|--------|
| Alice  | 25  | Female |
| Bob    | 30  | Male   |
| Claire | 27  | Female |

Then, print the DataFrame.

In [3]:
# Creating a Pandas DataFrame with the given data
data = {
    'Name': ['Alice', 'Bob', 'Claire'],
    'Age': [25, 30, 27],
    'Gender': ['Female', 'Male', 'Female']
}

df = pd.DataFrame(data)

# Printing the DataFrame
df


     Name  Age  Gender
0   Alice   25  Female
1     Bob   30    Male
2  Claire   27  Female

# Q4. What is a ‘DataFrame’ in pandas and how is it different from a pandas.Series? Explain with an example.

A **DataFrame** in pandas is a two-dimensional, size-mutable, and potentially heterogeneous tabular data structure with labeled axes (rows and columns). Think of it as a table or a spreadsheet where data is arranged in rows and columns.

A **Series**, on the other hand, is a one-dimensional array-like object that can hold any data type and is labeled by a single index. A Series is essentially a single column of data.

### Key Differences:
1. **Dimensionality**:
   - **Series**: One-dimensional (like a single column or row of data).
   - **DataFrame**: Two-dimensional (like a table with rows and columns).

2. **Data Structure**:
   - **Series**: Can be seen as a single column of data, indexed by a single index.
   - **DataFrame**: A collection of Series, where each Series represents a column.

3. **Usage**:
   - **Series**: Useful when dealing with a single list of data or when you need a labeled array.
   - **DataFrame**: Ideal for working with structured data where multiple columns and rows are involved.

### Example:

- **Series Example**:
   ```python
   import pandas as pd
   # Creating a Series
   age_series = pd.Series([25, 30, 27], index=['Alice', 'Bob', 'Claire'])
   print(age_series)
   ```

   **Output**:
   ```
   Alice     25
   Bob       30
   Claire    27
   dtype: int64
   ```

- **DataFrame Example**:
   ```python
   import pandas as pd
   # Creating a DataFrame
   data = {
       'Name': ['Alice', 'Bob', 'Claire'],
       'Age': [25, 30, 27],
       'Gender': ['Female', 'Male', 'Female']
   }
   df = pd.DataFrame(data)
   print(df)
   ```

   **Output**:
   ```
        Name  Age  Gender
   0   Alice   25  Female
   1     Bob   30    Male
   2  Claire   27  Female
   ```

In summary, while a **Series** is like a single column of data, a **DataFrame** is a table containing multiple columns, where each column can be a different data type.
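The relationship between the two structures can be checked directly. The following is a minimal sketch, reusing the names and ages from the examples above, showing that selecting one column of a DataFrame yields a Series and that the two objects report different dimensionality.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Claire'],
    'Age': [25, 30, 27],
})

# Selecting a single column returns a Series
age_column = df['Age']
print(type(age_column).__name__)   # Series

# A Series has one dimension, a DataFrame has two
print(age_column.ndim, df.ndim)    # 1 2

# Each column keeps its own data type
print(df.dtypes)
```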

# Q5. What are some common functions you can use to manipulate data in a Pandas DataFrame? Can you give an example of when you might use one of these functions?

Pandas offers a wide array of functions to manipulate and analyze data in a DataFrame. Here are some common functions:

### 1. **`head()` and `tail()`**
   - **Purpose**: To view the first or last few rows of a DataFrame.
   - **Example**: If you have a large dataset and want to quickly check the first few entries, you can use `df.head(5)` to see the first 5 rows.

### 2. **`describe()`**
   - **Purpose**: To generate summary statistics for numerical columns.
   - **Example**: Use `df.describe()` to get a quick overview of each numerical column: count, mean, standard deviation, minimum, quartiles (the 50% row is the median), and maximum.
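As a small illustration of what `describe()` reports, the sketch below uses a three-row "Age" column (sample data chosen for this example) and confirms that the `50%` row matches the median.

```python
import pandas as pd

df = pd.DataFrame({'Age': [25, 30, 27]})

summary = df.describe()

# Row labels of the summary table
print(list(summary.index))
# ['count', 'mean', 'std', 'min', '25%', '50%', '75%', 'max']

# The 50% row is the median of the column
print(summary.loc['50%', 'Age'] == df['Age'].median())   # True
```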

### 3. **`drop()`**
   - **Purpose**: To remove columns or rows from a DataFrame.
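   - **Example**: If you want to remove a column called "Age", you can use `df.drop('Age', axis=1)`. This returns a new DataFrame and leaves `df` unchanged, so assign the result if you want to keep it.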
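A point that is easy to miss with `drop()` is that, by default, it returns a new DataFrame rather than changing the original. The sketch below, on a two-row sample DataFrame, shows this along with the equivalent `columns=` keyword.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})

# drop() returns a new DataFrame; df itself is not modified
without_age = df.drop('Age', axis=1)
print(list(df.columns))            # ['Name', 'Age']
print(list(without_age.columns))   # ['Name']

# Equivalent spelling using the columns keyword
also_without_age = df.drop(columns='Age')
print(without_age.equals(also_without_age))   # True
```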

### 4. **`sort_values()`**
   - **Purpose**: To sort the DataFrame by one or more columns.
   - **Example**: Use `df.sort_values(by='Age')` to sort the DataFrame by the "Age" column in ascending order.
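For instance, sorting the sample people by age in both directions might look like the following sketch. The `ascending=False` argument reverses the default order.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Claire'],
    'Age': [25, 30, 27],
})

# Ascending order is the default
youngest_first = df.sort_values(by='Age')
print(youngest_first['Name'].tolist())   # ['Alice', 'Claire', 'Bob']

# Pass ascending=False to reverse the order
oldest_first = df.sort_values(by='Age', ascending=False)
print(oldest_first['Name'].tolist())     # ['Bob', 'Claire', 'Alice']
```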

### 5. **`loc[]` and `iloc[]`**
   - **Purpose**: To access specific rows and columns by labels (`loc[]`) or by position (`iloc[]`).
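   - **Example**: Use `df.loc[0, 'Name']` to access the value in the "Name" column for the row labeled `0`, which is the first row when the DataFrame has the default integer index. `df.iloc[0, 0]` selects the first row and first column by position, regardless of labels.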
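The difference between label-based and position-based access only becomes visible when the index is not the default `0, 1, 2, ...`. The sketch below assumes a hypothetical index of `10, 20, 30` to make that distinction concrete.

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['Alice', 'Bob', 'Claire'], 'Age': [25, 30, 27]},
    index=[10, 20, 30],   # hypothetical non-default row labels
)

# loc uses labels: the row labeled 10
print(df.loc[10, 'Name'])   # Alice

# iloc uses positions: first row, first column
print(df.iloc[0, 0])        # Alice

# Negative positions count from the end: last row, second column
print(df.iloc[-1, 1])       # 27

# There is no row labeled 0, so loc raises a KeyError
try:
    df.loc[0, 'Name']
except KeyError:
    print('no row labeled 0')
```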

### 6. **`groupby()`**
   - **Purpose**: To group data based on one or more columns and perform aggregate functions.
   - **Example**: If you want to find the average age by gender, you can use `df.groupby('Gender')['Age'].mean()`.
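Applied to the three-person sample data used earlier in this notebook, the average-age-by-gender calculation could be sketched as:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Claire'],
    'Age': [25, 30, 27],
    'Gender': ['Female', 'Male', 'Female'],
})

# Group rows by Gender, then average the Age values in each group
mean_age = df.groupby('Gender')['Age'].mean()
print(mean_age['Female'])   # 26.0
print(mean_age['Male'])     # 30.0
```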

### 7. **`merge()`**
   - **Purpose**: To combine two DataFrames based on a common column (like SQL joins).
   - **Example**: If you have two DataFrames with different data on the same set of individuals, you can merge them with `pd.merge(df1, df2, on='Name')`.
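To make the SQL-join analogy concrete, here is a minimal sketch with two hypothetical DataFrames, `ages` and `cities`, that share a "Name" column but do not contain exactly the same people. The default merge keeps only matching names, while `how='left'` keeps every row of the left DataFrame.

```python
import pandas as pd

# Hypothetical sample data for illustration
ages = pd.DataFrame({'Name': ['Alice', 'Bob', 'Claire'], 'Age': [25, 30, 27]})
cities = pd.DataFrame({'Name': ['Alice', 'Bob', 'Dana'],
                       'City': ['Paris', 'Oslo', 'Lima']})

# Default is an inner join: only names present in both DataFrames
inner = pd.merge(ages, cities, on='Name')
print(inner)

# A left join keeps all rows of `ages`; missing cities become NaN
left = pd.merge(ages, cities, on='Name', how='left')
print(left)
```

In the left join, Claire has no matching row in `cities`, so her "City" value is missing.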

### 8. **`apply()`**
   - **Purpose**: To apply a function to each element, row, or column.
   - **Example**: If you want to increase every value in the "Age" column by 1, you could use `df['Age'] = df['Age'].apply(lambda x: x + 1)`.
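For simple arithmetic like this, a vectorized expression gives the same result as `apply()` and is usually the more idiomatic choice. The following sketch compares the two on a small sample column.

```python
import pandas as pd

df = pd.DataFrame({'Age': [25, 30, 27]})

# Element-wise function call with apply()
with_apply = df['Age'].apply(lambda x: x + 1)

# Vectorized arithmetic on the whole column at once
vectorized = df['Age'] + 1

print(with_apply.tolist())             # [26, 31, 28]
print(with_apply.equals(vectorized))   # True
```

`apply()` remains useful when the per-element logic cannot be expressed with built-in vectorized operations.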

### 9. **`pivot_table()`**
   - **Purpose**: To create a pivot table, summarizing data.
   - **Example**: If you want to summarize sales data by product and region, you can use a pivot table like `df.pivot_table(index='Product', columns='Region', values='Sales', aggfunc='sum')`.
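A runnable version of that call, using a small made-up sales table with "Product", "Region", and "Sales" columns, might look like this:

```python
import pandas as pd

# Sample sales data for illustration
df = pd.DataFrame({
    'Product': ['A', 'B', 'A', 'B', 'A', 'B'],
    'Region': ['North', 'North', 'South', 'South', 'East', 'East'],
    'Sales': [100, 200, 150, 250, 130, 300],
})

# One row per product, one column per region, summed sales in each cell
pivot = df.pivot_table(index='Product', columns='Region',
                       values='Sales', aggfunc='sum')
print(pivot)
```

The result has products as row labels and regions as column labels, so a single cell such as `pivot.loc['A', 'North']` gives the total for that combination.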

### 10. **`fillna()`**
   - **Purpose**: To fill missing values in a DataFrame.
   - **Example**: If your DataFrame has missing values that you want to fill with the mean of the column, you can use `df['Column'] = df['Column'].fillna(df['Column'].mean())`.
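The mean-fill pattern can be sketched on a tiny example. The "Score" column here is a hypothetical name chosen for illustration; note that `mean()` skips missing values by default, so the fill value is computed from the known entries only.

```python
import numpy as np
import pandas as pd

# Hypothetical column with one missing value
df = pd.DataFrame({'Score': [80.0, np.nan, 90.0]})

# mean() ignores NaN, so the fill value is (80 + 90) / 2 = 85
df['Score'] = df['Score'].fillna(df['Score'].mean())

print(df['Score'].tolist())   # [80.0, 85.0, 90.0]
```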

### Example Use Case:
**Scenario**: Suppose you have a DataFrame containing sales data with columns "Product", "Region", and "Sales". You want to find out the total sales per product.

```python
import pandas as pd

# Sample DataFrame
data = {
    'Product': ['A', 'B', 'A', 'B', 'A', 'B'],
    'Region': ['North', 'North', 'South', 'South', 'East', 'East'],
    'Sales': [100, 200, 150, 250, 130, 300]
}

df = pd.DataFrame(data)

# Grouping by 'Product' to sum sales
total_sales_per_product = df.groupby('Product')['Sales'].sum()

print(total_sales_per_product)
```

**Output**:
```
Product
A    380
B    750
Name: Sales, dtype: int64
```

Here, `groupby()` helps in summarizing the sales data by product. This is useful for understanding overall product performance across different regions.

# Q6. Which of the following is mutable in nature: Series, DataFrame, Panel?

Among the three options—**Series**, **DataFrame**, and **Panel**—both **Series** and **DataFrame** are mutable in nature.

### Mutable Structures:
- **Series**: A Series is value-mutable, meaning you can modify its elements or change its index labels after it has been created. The pandas documentation describes the length of a Series as fixed, so it is generally not considered size-mutable.
- **DataFrame**: A DataFrame is also mutable. You can add, remove, or modify columns and rows, change the index, and update the data within the DataFrame.

### Deprecated Structure:
- **Panel**: A Panel was a 3D data structure in pandas. It was deprecated in version 0.20.0 and removed in version 0.25.0 (July 2019). Panels were mutable as well, but their functionality has been replaced by multi-indexed DataFrames or the xarray library for multi-dimensional data.

In summary, both **Series** and **DataFrame** are mutable in pandas.
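The following sketch illustrates what mutability means in practice: values and labels of an existing Series or DataFrame are changed in place, without creating a new object. The sample data is made up for this example.

```python
import pandas as pd

# Series: change a value and relabel the index in place
s = pd.Series([4, 8, 15], index=['a', 'b', 'c'])
s['a'] = 100
s.index = ['x', 'y', 'z']
print(s['x'])   # 100

# DataFrame: add a column, update a cell, and delete a column in place
df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})
df['Gender'] = ['Female', 'Male']
df.loc[0, 'Age'] = 26
del df['Name']
print(list(df.columns))   # ['Age', 'Gender']
```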


# Q7. Create a DataFrame using multiple Series. Explain with an example.

You can create a DataFrame in pandas by combining multiple Series, where each Series becomes a column in the DataFrame. Here's an example to demonstrate this:

### Example

Let's create three Series for "Name", "Age", and "Gender", and then combine them into a DataFrame.

```python
import pandas as pd

# Creating individual Series
name_series = pd.Series(['Alice', 'Bob', 'Claire'])
age_series = pd.Series([25, 30, 27])
gender_series = pd.Series(['Female', 'Male', 'Female'])

# Creating a DataFrame by combining the Series
df = pd.DataFrame({
    'Name': name_series,
    'Age': age_series,
    'Gender': gender_series
})

# Printing the DataFrame
print(df)
```

### Output
```
     Name  Age  Gender
0   Alice   25  Female
1     Bob   30    Male
2  Claire   27  Female
```

### Explanation:

- **Step 1: Create Series**
  - We first create three Series, each representing a column of data. For example, `name_series` contains the names "Alice", "Bob", and "Claire".

- **Step 2: Combine Series into a DataFrame**
  - We pass these Series to the `pd.DataFrame()` constructor as a dictionary, where the keys represent the column names ("Name", "Age", "Gender") and the values are the Series themselves.

- **Step 3: Resulting DataFrame**
  - The result is a DataFrame where each column corresponds to one of the Series. The index of the DataFrame is automatically assigned, starting from 0.

This method is particularly useful when you have data spread across multiple Series and want to bring it together into a structured table format, which allows for more complex data analysis and manipulation.
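One detail worth knowing when combining Series this way is that pandas aligns them by index label, not by position. The sketch below assumes an `age_series` that has no entry for label `1`; the resulting DataFrame fills that cell with `NaN`, and the "Age" column becomes floating point as a result.

```python
import pandas as pd

name_series = pd.Series(['Alice', 'Bob', 'Claire'], index=[0, 1, 2])
age_series = pd.Series([25, 30], index=[0, 2])   # no value for label 1

# Series are matched on index labels when building the DataFrame
df = pd.DataFrame({'Name': name_series, 'Age': age_series})
print(df)

# The row labeled 1 has no age, so it is filled with NaN
print(df['Age'].isna().tolist())   # [False, True, False]
```

If the Series are meant to line up by position, giving them the same index (as in the example above with the default index) avoids unexpected missing values.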