### Q1. Create a Pandas Series that contains the following data: 4, 8, 15, 16, 23, and 42. Then, print the series.

In [1]:
import pandas as pd

In [2]:
data=[4,8,15,16,23,42]
series=pd.Series(data)

In [3]:
print(series)

0     4
1     8
2    15
3    16
4    23
5    42
dtype: int64


### Q2. Create a variable of list type containing 10 elements, apply the pandas.Series function to it, and print the result.

In [4]:
import pandas as pd 

In [7]:
my_list=[1,2,3,4,5,6,7,8,9,10]
series=pd.Series(my_list)


In [8]:
print(series)

0     1
1     2
2     3
3     4
4     5
5     6
6     7
7     8
8     9
9    10
dtype: int64


### Q3. [Screenshot: table of Name, Age, and Gender values to load into a DataFrame and print]

In [23]:
import pandas as pd
data = {
    "Name": ["Alice", "Bob", "Claire"],
    "Age": [25, 30, 27],
    "Gender": ["Female", "Male", "Female"],
}
df = pd.DataFrame(data)
print(df)


     Name  Age  Gender
0   Alice   25  Female
1     Bob   30    Male
2  Claire   27  Female


### Q4. What is a ‘DataFrame’ in pandas and how is it different from pandas.Series? Explain with an example.

In pandas, a DataFrame is a two-dimensional labeled data structure capable of holding data of various types (e.g., integer, float, string, etc.) in columns. It is similar to a spreadsheet or SQL table, where data is organized into rows and columns. Each column in a DataFrame is a Pandas Series, making a DataFrame essentially a collection of Series that share the same index.

On the other hand, a Pandas Series is a one-dimensional labeled array capable of holding data of any type (e.g., integer, float, string, etc.). It is essentially a single column of data with associated labels, referred to as the index. Series are the building blocks of a DataFrame.

Here's a simple example to illustrate the difference:

```python
import pandas as pd

# Creating a Series
series = pd.Series([1, 2, 3, 4, 5], name='Numbers')

# Creating a DataFrame
data = {
    'A': [1, 2, 3],
    'B': [4, 5, 6],
    'C': [7, 8, 9]
}
df = pd.DataFrame(data)

# Printing the Series and DataFrame
print("Series:")
print(series)
print("\nDataFrame:")
print(df)
```

Output:

```
Series:
0    1
1    2
2    3
3    4
4    5
Name: Numbers, dtype: int64

DataFrame:
   A  B  C
0  1  4  7
1  2  5  8
2  3  6  9
```

In the example above, `series` is a Pandas Series containing integers with labels (indices) from 0 to 4, whereas `df` is a DataFrame consisting of three columns labeled 'A', 'B', and 'C'. Each column in the DataFrame is itself a Pandas Series (`df['A']`, `df['B']`, `df['C']`) containing integers with indices 0, 1, and 2.
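The relationship between the two structures can also be checked directly. The following is a minimal sketch that reuses the same `df` as above; the variable names `column_a` and `subset` are introduced here only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Selecting one column returns a Series;
# selecting a list of columns returns a DataFrame
column_a = df['A']
subset = df[['A', 'B']]

print(type(column_a).__name__)   # Series
print(type(subset).__name__)     # DataFrame

# A Series is one-dimensional, a DataFrame is two-dimensional
print(column_a.ndim, df.ndim)    # 1 2
print(column_a.shape, df.shape)  # (3,) (3, 3)

# The column shares the DataFrame's row index
print(column_a.index.equals(df.index))  # True
```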

### Q5. What are some common functions you can use to manipulate data in a Pandas DataFrame? Can you give an example of when you might use one of these functions?

There are several common functions and methods available in Pandas for manipulating data in a DataFrame. Some of these include:

1. **head() and tail()**: These functions are used to view the first or last n rows of the DataFrame, respectively. They are helpful for quickly inspecting the structure and contents of the DataFrame.

2. **info() and describe()**: `info()` provides a concise summary of the DataFrame including its data types, memory usage, and non-null counts per column. `describe()` computes summary statistics for numerical columns: count, mean, standard deviation, min, max, and the quartiles (the 50% quartile is the median).

3. **shape**: This attribute returns a tuple representing the dimensions of the DataFrame (number of rows, number of columns).

4. **loc[] and iloc[]**: These are used for label-based and integer-based indexing, respectively. You can use them to access specific rows or columns of the DataFrame.

5. **drop()**: This function is used to remove rows or columns from the DataFrame based on labels or indices.

6. **fillna()**: This function is used to fill missing values in the DataFrame with specified values.

7. **groupby()**: This function is used to group the DataFrame by one or more columns and perform aggregations on the grouped data.

8. **sort_values()**: This function is used to sort the DataFrame by specified column(s).

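9. **apply() and applymap()**: These functions are used to apply a function along an axis of the DataFrame or element-wise on the DataFrame, respectively. Since pandas 2.1, `applymap()` is deprecated in favor of `DataFrame.map()`, which does the same element-wise job.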

10. **merge() and concat()**: These functions are used to combine multiple DataFrames either by merging them based on common columns (`merge()`) or by concatenating them along a specified axis (`concat()`).
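As a quick illustration of several methods from the list, here is a minimal sketch that inspects, selects, fills, sorts, and drops. The sample data and the variable names (`filled`, `ranked`, `names_only`) are hypothetical and chosen only for this example.

```python
import pandas as pd

# Hypothetical sample data with one missing score
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Claire', 'Dan'],
    'Score': [88.0, float('nan'), 95.0, 72.0],
})

print(df.shape)    # (4, 2)
print(df.head(2))  # first two rows

# loc[] selects by label, iloc[] selects by integer position
first_name = df.loc[0, 'Name']   # 'Alice'
last_score = df.iloc[-1, 1]      # 72.0

# fillna(): replace the missing score with the column mean (85.0)
filled = df.fillna({'Score': df['Score'].mean()})

# sort_values(): highest score first
ranked = filled.sort_values('Score', ascending=False)

# drop(): returns a new DataFrame without the 'Score' column
names_only = ranked.drop(columns=['Score'])

print(ranked)
```

Each of these calls returns a new object rather than modifying `df`, which is why the results are assigned to new names.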

Example:
Suppose you have a DataFrame containing sales data and you want to calculate the total sales amount for each product category. You can use the `groupby()` function to group the DataFrame by the 'Product Category' column and then apply the `sum()` function to calculate the total sales amount for each category.

```python
import pandas as pd

# Sample DataFrame
data = {
    'Product Category': ['A', 'B', 'A', 'C', 'B', 'C'],
    'Sales Amount': [100, 200, 150, 300, 250, 200]
}
df = pd.DataFrame(data)

# Grouping by 'Product Category' and calculating total sales amount
total_sales_by_category = df.groupby('Product Category')['Sales Amount'].sum()

print(total_sales_by_category)
```

Output:
```
Product Category
A    250
B    450
C    500
Name: Sales Amount, dtype: int64
```

This gives you the total sales amount for each product category.
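`merge()` and `concat()` from the list above can be sketched with the same kind of data. The `managers` and `more_sales` tables below, including the manager names, are made up for illustration.

```python
import pandas as pd

# Hypothetical tables for illustration
sales = pd.DataFrame({
    'Product Category': ['A', 'B', 'C'],
    'Sales Amount': [250, 450, 500],
})
managers = pd.DataFrame({
    'Product Category': ['A', 'B', 'C'],
    'Manager': ['Dana', 'Eli', 'Fay'],
})

# merge(): join the two tables on their shared column
merged = pd.merge(sales, managers, on='Product Category')
print(merged)

# concat(): stack extra rows under the existing ones
more_sales = pd.DataFrame({'Product Category': ['D'], 'Sales Amount': [120]})
stacked = pd.concat([sales, more_sales], ignore_index=True)
print(stacked)
```

`merge()` is the natural choice when the tables describe the same entities with different columns, while `concat()` fits when the tables have the same columns and you want more rows.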

### Q6. Which of the following is mutable in nature: Series, DataFrame, Panel?

All three structures (Series, DataFrame, Panel) are value-mutable: the values they hold can be changed after creation. They differ in whether their size can change and in whether they still exist in current pandas.

- **Series**: A Pandas Series is mutable, meaning you can modify its values after creation. The pandas documentation describes its length as fixed, so it is usually classed as value-mutable but size-immutable.

- **DataFrame**: A Pandas DataFrame is mutable in both values and size. You can modify its contents, add or remove columns, and perform various data manipulation operations on it.

- **Panel**: Panel was a data structure in Pandas for handling three-dimensional data, and it was also mutable while it existed. It was deprecated in version 0.20.0 and removed in version 0.25.0, so you will not encounter Panels in recent versions of Pandas.

So, the correct answer is that all three are mutable in nature, but only **Series** and **DataFrame** remain available in current pandas.
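Mutability of the two structures that exist in current pandas can be demonstrated with a short sketch. The sample values are arbitrary, and Panel is left out because it is no longer part of the library.

```python
import pandas as pd

# Series: change a value in place
s = pd.Series([4, 8, 15])
s.loc[0] = 40
print(s.tolist())          # [40, 8, 15]

# DataFrame: change a cell, add a column, remove a column
df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})
df.loc[1, 'Age'] = 31                 # modify one value
df['Gender'] = ['Female', 'Male']     # add a column
del df['Name']                        # remove a column
print(list(df.columns))    # ['Age', 'Gender']
```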

### Q7. Create a DataFrame using multiple Series. Explain with an example.

We can create a DataFrame using multiple Pandas Series by passing them as a dictionary to the `pd.DataFrame()` constructor. Each Series will become a column in the resulting DataFrame.

Here's an example:

```python
import pandas as pd

# Creating Pandas Series
name_series = pd.Series(["Alice", "Bob", "Claire"])
age_series = pd.Series([25, 30, 27])
gender_series = pd.Series(["Female", "Male", "Female"])

# Creating DataFrame using multiple Series
data = {
    "Name": name_series,
    "Age": age_series,
    "Gender": gender_series
}
df = pd.DataFrame(data)

# Printing the DataFrame
print(df)
```

Output:
```
     Name  Age  Gender
0   Alice   25  Female
1     Bob   30    Male
2  Claire   27  Female
```

In this example, we have three Pandas Series: `name_series`, `age_series`, and `gender_series`, each containing data for the 'Name', 'Age', and 'Gender' columns, respectively. We then create a dictionary `data` where the keys are column names and the values are the corresponding Series. Finally, we pass this dictionary to the `pd.DataFrame()` constructor to create the DataFrame `df`.
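One detail worth knowing is that the constructor aligns the Series on their index labels rather than on position. The sketch below assumes custom string labels (`"a"`, `"b"`, `"c"`) that are not part of the example above; a Series with no entry for a label gets `NaN` in that row.

```python
import pandas as pd

# Hypothetical labels; age_series has no entry for "c"
name_series = pd.Series(["Alice", "Bob", "Claire"], index=["a", "b", "c"])
age_series = pd.Series([25, 30], index=["a", "b"])

df = pd.DataFrame({"Name": name_series, "Age": age_series})
print(df)

# The row labelled "c" exists, but its Age is missing
print(pd.isna(df.loc["c", "Age"]))  # True
```

Because a missing value is introduced, the `Age` column is stored as floating point rather than integer in this case.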