In [1]:
# Q1. Create a Pandas Series that contains the following data: 4, 8, 15, 16, 23, and 42. Then, print the series.

### Solution 1-

In [2]:
import pandas as pd

In [3]:
data = [4,8,15,16,23,42]

In [4]:
series = pd.Series(data)

In [5]:
print(series)

0     4
1     8
2    15
3    16
4    23
5    42
dtype: int64


In [6]:
# Q2. Create a list variable containing 10 elements, apply the pandas.Series function to it,
# and print the result.

### Solution 2-

In [7]:
l = [1,2,3,4,5,6,7,8,9,10]

In [8]:
series = pd.Series(l)

In [9]:
print(series)

0     1
1     2
2     3
3     4
4     5
5     6
6     7
7     8
8     9
9    10
dtype: int64


In [10]:
# Q3. Create a Pandas DataFrame that contains the following data:
#      Name  Age  Gender
# 0   Alice   25  Female
# 1     Bob   30    Male
# 2  Claire   27  Female

### Solution 3-

In [11]:
data = {
    "Name": ["Alice", "Bob", "Claire"],
    "Age": [25, 30, 27],
    "Gender": ["Female", "Male", "Female"]
}

In [12]:
df = pd.DataFrame(data)

In [13]:
print(df)

     Name  Age  Gender
0   Alice   25  Female
1     Bob   30    Male
2  Claire   27  Female


In [14]:
# Q4. What is a ‘DataFrame’ in pandas, and how is it different from pandas.Series? Explain with an example.

### Solution 4-

<span style = 'font-size:0.8em;'>
    
In Pandas, a DataFrame is a two-dimensional, size-mutable, and potentially heterogeneous tabular data structure with labeled axes (rows and columns). It is essentially a table or spreadsheet-like data structure containing an ordered collection of columns, each of which can be a different data type (integer, float, string, etc.). DataFrames are highly versatile and offer functionalities for data manipulation, cleaning, analysis, and visualization.
    

On the other hand, a Pandas Series is a one-dimensional labeled array capable of holding data of any type (integer, float, string, etc.). It can be thought of as a single column of a DataFrame, and it shares many characteristics with NumPy arrays. However, Series has additional features such as axis labels, which allow for more intuitive data access and manipulation.
    
</span>

In [15]:
# Example
# Creating a DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Claire'],
    'Age': [25, 30, 27],
    'Gender': ['Female', 'Male', 'Female']
}
df = pd.DataFrame(data)
print("DataFrame:")
print(df)
print()

# Accessing a Series (a single column) from the DataFrame
age_series = df['Age']
print("Series (Age column from DataFrame):")
print(age_series)

DataFrame:
     Name  Age  Gender
0   Alice   25  Female
1     Bob   30    Male
2  Claire   27  Female

Series (Age column from DataFrame):
0    25
1    30
2    27
Name: Age, dtype: int64


<span style = 'font-size:0.8em;'>
    
In the example above, df is a DataFrame containing three columns ('Name', 'Age', 'Gender'). We access the 'Age' column using square brackets (df['Age']), which returns a Series containing the ages of individuals. So, while a DataFrame represents tabular data with rows and columns, a Series represents a single column of data with an index.
</span>
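
The difference in dimensionality and labeling described above can also be checked directly. The following is a minimal sketch that reuses the same three-row table; the lowercase index labels in `labeled_age` are hypothetical and only illustrate label-based access on a Series.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Claire"],
    "Age": [25, 30, 27],
    "Gender": ["Female", "Male", "Female"],
})

# A DataFrame is two-dimensional, and each column keeps its own dtype.
print(df.ndim)
print(df.shape)
print(df.dtypes)

# Selecting a single column returns a one-dimensional Series.
age = df["Age"]
print(type(age).__name__)
print(age.ndim)

# A Series carries index labels; these labels are made up for illustration.
labeled_age = pd.Series([25, 30, 27], index=["alice", "bob", "claire"])
print(labeled_age["bob"])
```

Here `df.ndim` is 2 and `age.ndim` is 1, and the last line looks up a value by its label rather than by its position.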


In [16]:
# Q5. What are some common functions you can use to manipulate data in a Pandas DataFrame? Can
# you give an example of when you might use one of these functions?

### Solution 5-

<span style = 'font-size:0.8em;'>


Pandas provides a wide range of functions for data manipulation in a DataFrame. Some common functions include:

1. **`head()` and `tail()`**: These functions are used to view the first or last few rows of a DataFrame, respectively. They are helpful for quickly inspecting the structure and contents of the DataFrame.

2. **`info()` and `describe()`**: `info()` provides a concise summary of the DataFrame including the data types and number of non-null values in each column, while `describe()` computes summary statistics for numerical columns (e.g., count, mean, min, max, etc.).

3. **`set_index()` and `reset_index()`**: `set_index()` is used to set one or more columns as the index of the DataFrame, while `reset_index()` resets the index, optionally turning it into a column.

4. **`loc[]` and `iloc[]`**: These indexers select rows and columns. `loc[]` selects by label, while `iloc[]` selects by integer position.

5. **`drop()`**: This function is used to remove rows or columns from the DataFrame based on labels or index.

6. **`fillna()` and `dropna()`**: These functions are used for handling missing data. `fillna()` fills missing values with specified values, while `dropna()` drops rows or columns containing missing values.

7. **`groupby()`**: This function is used to group data based on one or more columns and perform operations on the grouped data.

8. **`merge()` and `concat()`**: These functions are used for combining DataFrames. `merge()` is used to merge DataFrames similar to SQL joins, while `concat()` is used to concatenate DataFrames along rows or columns.

9. **`apply()`**: This function applies a function along an axis of the DataFrame.

10. **`sort_values()` and `sort_index()`**: These functions are used for sorting data either by values or index labels.
</span>


In [17]:
# Here's an example of using one of these functions: 
data = {
    'Date': ['2024-05-01', '2024-05-01', '2024-05-02', '2024-05-02'],
    'Product': ['A', 'B', 'A', 'B'],
    'Sales': [100, 200, 150, 250]
}
df = pd.DataFrame(data)

# Grouping by 'Product' and calculating total sales
total_sales = df.groupby('Product')['Sales'].sum()

print(total_sales)

Product
A    250
B    450
Name: Sales, dtype: int64


<span style = 'font-size:0.8em;'>
    
In this example, `groupby('Product')['Sales'].sum()` groups the DataFrame by the 'Product' column and calculates the sum of 'Sales' for each product, giving us the total sales for each product.
</span>
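
A few of the other functions from the list can be sketched in the same way. The example below is only an illustration under assumed data: the `sales` and `prices` tables and their values are hypothetical and are not part of the exercise. It shows `fillna()`, `sort_values()`, `loc[]` versus `iloc[]`, and `merge()`.

```python
import numpy as np
import pandas as pd

# Hypothetical sales data with one missing value.
sales = pd.DataFrame({
    "Product": ["A", "B", "C", "D"],
    "Sales": [100.0, np.nan, 150.0, 250.0],
})

# fillna(): replace the missing Sales value with 0 (returns a new DataFrame).
filled = sales.fillna({"Sales": 0})

# sort_values(): order rows by Sales, largest first. Index labels are kept.
ranked = filled.sort_values("Sales", ascending=False)

# loc[] selects by index label, iloc[] by integer position.
by_label = ranked.loc[0, "Product"]   # the row whose label is 0
by_position = ranked.iloc[0, 0]       # the first row after sorting

# merge(): SQL-style join with a hypothetical price table on 'Product'.
prices = pd.DataFrame({
    "Product": ["A", "B", "C", "D"],
    "Price": [2.0, 3.0, 4.0, 5.0],
})
merged = filled.merge(prices, on="Product", how="left")

print(ranked)
print(by_label, by_position)
print(merged)
```

After sorting, the row labeled 0 is no longer the first row, so `loc[0, ...]` and `iloc[0, ...]` return different products. This is a typical situation where the distinction between label-based and position-based selection matters.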

In [18]:
# Q6. Which of the following is mutable in nature Series, DataFrame, Panel?

### Solution 6-

<span style = 'font-size:0.8em;'>

Series, DataFrame, and Panel are all value-mutable: the values they contain can be changed after creation. They differ in whether their size can change.

- Series: Series objects are value-mutable, so existing values and index labels can be reassigned. The pandas documentation has traditionally described a Series as size-immutable, meaning its length is treated as fixed.
- DataFrame: DataFrame objects are both value-mutable and size-mutable, meaning you can modify their values, add or remove columns, and perform other operations that alter the DataFrame's structure and content.
- Panel: Panel objects were deprecated in pandas 0.20.0 and removed in pandas 0.25.0. They were three-dimensional data structures and were also mutable, but they are no longer available in current versions of pandas.

Hence, all three are mutable in their values, and **DataFrame** is the one that is fully mutable (in both values and size) in current versions of pandas.
</span>

In [19]:
# Here's an example demonstrating how we can modify a DataFrame, showing its mutable nature:

# Create a DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Claire'],
    'Age': [25, 30, 27],
    'Gender': ['Female', 'Male', 'Female']
}
df = pd.DataFrame(data)

print("Original DataFrame:")
print(df)
print()

# Modify the DataFrame by adding a new column
df['Country'] = ['USA', 'UK', 'Canada']

print("Modified DataFrame (added 'Country' column):")
print(df)


Original DataFrame:
     Name  Age  Gender
0   Alice   25  Female
1     Bob   30    Male
2  Claire   27  Female

Modified DataFrame (added 'Country' column):
     Name  Age  Gender Country
0   Alice   25  Female     USA
1     Bob   30    Male      UK
2  Claire   27  Female  Canada


In [20]:
# Q7. Create a DataFrame using multiple Series. Explain with an example.

### Solution 7-

<span style = 'font-size:0.8em;'>
    
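We can create a DataFrame from multiple Series by passing them in a dictionary to `pd.DataFrame()`. Each Series becomes a column, with its dictionary key as the column name. pandas aligns the Series on their index labels, so the indexes should match; a label that is missing from one Series produces a missing value (NaN) in that column.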
</span>


In [21]:
# Here's an example:

# Create multiple Series
name_series = pd.Series(['Alice', 'Bob', 'Claire'])
age_series = pd.Series([25, 30, 27])
gender_series = pd.Series(['Female', 'Male', 'Female'])

# Create a DataFrame using the Series
data = {
    'Name': name_series,
    'Age': age_series,
    'Gender': gender_series
}
df = pd.DataFrame(data)

# Print the DataFrame
print(df)


     Name  Age  Gender
0   Alice   25  Female
1     Bob   30    Male
2  Claire   27  Female
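
The note in Solution 7 about matching indexes can be illustrated with a small sketch. The two Series below are hypothetical and deliberately use index labels that only partly overlap, to show how pandas aligns on labels rather than on position.

```python
import pandas as pd

# Two hypothetical Series whose index labels only partly overlap.
names = pd.Series(["Alice", "Bob", "Claire"], index=[0, 1, 2])
ages = pd.Series([30, 27, 41], index=[1, 2, 3])

# The resulting DataFrame uses the union of both indexes; positions
# with no value in one of the Series are filled with a missing value.
df = pd.DataFrame({"Name": names, "Age": ages})
print(df)
```

The result has four rows (labels 0 to 3). Row 0 has no age and row 3 has no name, so those cells are shown as missing. When the Series share the same index, as in the example above, no such gaps appear.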
