## Q1. Create a Pandas Series that contains the following data: 4, 8, 15, 16, 23, and 42. Then, print the series.

```python
import pandas as pd

data = [4, 8, 15, 16, 23, 42]
series = pd.Series(data)
print(series)
```

```
0     4
1     8
2    15
3    16
4    23
5    42
dtype: int64
```


## Q2. Create a list variable containing 10 elements, apply the pandas.Series function to it, and print the result.

```python
import pandas as pd

my_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
series = pd.Series(my_list)
print(series)  # print the Series, not the original list
```

```
0     1
1     2
2     3
3     4
4     5
5     6
6     7
7     8
8     9
9    10
dtype: int64
```


## Q3. Create a Pandas DataFrame that contains the following data:

```python
import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Claire'],
    'Age': [25, 30, 27],
    'Gender': ['Female', 'Male', 'Female']
}
df = pd.DataFrame(data)
print(df)
```

```
     Name  Age  Gender
0   Alice   25  Female
1     Bob   30    Male
2  Claire   27  Female
```


## Q4. What is a ‘DataFrame’ in pandas and how is it different from pandas.Series? Explain with an example.

In Pandas, a **DataFrame** is a two-dimensional labeled data structure that can store data of different types (such as numeric, string, boolean, etc.) in tabular form. It is similar to a spreadsheet or a SQL table, where you have rows and columns, and each column can have a different data type. DataFrames are one of the most commonly used data structures in Pandas and are suitable for representing structured and heterogeneous data.

On the other hand, a **Series** is a one-dimensional labeled array capable of holding data of a single data type. It's like a single column of data in a DataFrame. Series have labels (indices) that allow for quick and easy data retrieval. A DataFrame can be thought of as a collection of Series objects, where each Series represents a column.



```python
import pandas as pd

# Creating a Series
series_data = pd.Series([10, 20, 30, 40, 50])
print("Series:")
print(series_data)
print("\n")

# Creating a DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [25, 32, 28, 45, 29],
    'Salary': [60000, 80000, 70000, 90000, 75000]
}

df = pd.DataFrame(data)
print("DataFrame:")
print(df)
```

```
Series:
0    10
1    20
2    30
3    40
4    50
dtype: int64


DataFrame:
      Name  Age  Salary
0    Alice   25   60000
1      Bob   32   80000
2  Charlie   28   70000
3    David   45   90000
4      Eve   29   75000
```
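
To make the "collection of Series" idea concrete, here is a minimal sketch (with a shortened, illustrative version of the data above) that checks that selecting a single column of a DataFrame gives back a Series sharing the DataFrame's index:

```python
import pandas as pd

# Small illustrative frame; the names and values are sample data only
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 32, 28],
})

age = df['Age']  # selecting one column returns a Series

print(type(df).__name__)            # DataFrame
print(type(age).__name__)           # Series
print(df.ndim, age.ndim)            # 2 1
print(age.index.equals(df.index))   # True
```

The column keeps the row labels of the DataFrame it came from, which is what lets pandas line columns up with each other.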


## Q5. What are some common functions you can use to manipulate data in a Pandas DataFrame? Can you give an example of when you might use one of these functions?

Pandas provides a wide range of functions to manipulate data in a DataFrame. Here are some common functions along with examples of when you might use them:

1. **`head()` and `tail()`**: These functions allow you to view the first few rows (head) or last few rows (tail) of a DataFrame. Useful for quickly inspecting the data.

```python
import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [25, 32, 28, 45, 29],
    'Salary': [60000, 80000, 70000, 90000, 75000]
}

df = pd.DataFrame(data)

print(df.head(2))  # View the first 2 rows
print(df.tail(3))  # View the last 3 rows
```

2. **`info()`**: Provides information about the DataFrame, including the data types of columns and memory usage.

```python
df.info()  # info() prints its summary directly and returns None, so no print() is needed
```

3. **`describe()`**: Generates summary statistics of numerical columns, like mean, standard deviation, minimum, maximum, etc.

```python
print(df.describe())
```

4. **`sort_values()`**: Sorts the DataFrame by one or more columns.

```python
sorted_df = df.sort_values(by='Age')  # Sorting by the 'Age' column
print(sorted_df)
```

5. **`groupby()`**: Allows you to group data based on one or more columns and perform aggregate functions on the groups.

```python
grouped = df.groupby('Age')['Salary'].mean()  # Mean salary per distinct age (each age occurs once here, so every group has one row)
print(grouped)
```

6. **`drop()`**: Used to drop specific rows or columns from the DataFrame.

```python
modified_df = df.drop(columns='Salary')  # Remove the 'Salary' column
print(modified_df)
```

7. **`fillna()`**: Fills missing values in the DataFrame with specified values.

```python
filled_df = df.fillna(0)  # Fill NaN values with 0 (this df has none, so the result is unchanged)
print(filled_df)
```

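8. **`apply()`**: Applies a function to the data. On a Series (a single column) the function is called on each element; on a DataFrame it is called on each column, or on each row with `axis=1`.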

```python
def double_salary(salary):
    return salary * 2

df['Double_Salary'] = df['Salary'].apply(double_salary)
print(df)
```

These are just a few examples of the many functions available in Pandas for data manipulation. Depending on your specific data analysis needs, you can choose the appropriate function to perform the desired operations on your DataFrame.
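
The `fillna()` and `groupby()` snippets above run on a frame with no missing values and no repeated keys, so their effect is hard to see. The sketch below uses a small hypothetical `staff` frame (the `Department` column and its values are invented for illustration) to show a case where you would actually reach for them:

```python
import numpy as np
import pandas as pd

# Hypothetical data: one salary is missing, and departments repeat
staff = pd.DataFrame({
    'Department': ['Sales', 'Sales', 'IT', 'IT'],
    'Salary': [60000.0, np.nan, 70000.0, 90000.0],
})

# Replace the missing salary with 0 (a placeholder choice, not a recommendation)
filled = staff.fillna({'Salary': 0})

# Mean salary per department; mean() skips NaN values by default
mean_by_dept = staff.groupby('Department')['Salary'].mean()

print(filled)
print(mean_by_dept)
```

Whether to fill missing values with 0, a column mean, or leave them as NaN depends on the analysis; filling with 0 before averaging would pull the Sales mean down, which is why the grouping here is done on the unfilled frame.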

## Q6. Which of the following is mutable in nature: Series, DataFrame, Panel?

Among the options given, **Series** and **DataFrame** are both mutable: their values can be changed after creation. **Panel** was also value-mutable, but it has been removed from pandas, so in practice only Series and DataFrame remain.

1. **Series**: A Series is value-mutable, meaning you can change individual elements after the Series is created. Older pandas documentation describes it as size-immutable (its length is fixed), in contrast to DataFrame.

2. **DataFrame**: A DataFrame is mutable in both value and size. You can modify column values, add or remove columns, and perform other operations that change the data within the DataFrame.

3. **Panel**: Panel was a three-dimensional container. It was deprecated in pandas 0.20.0 and removed in pandas 0.25.0. Multi-dimensional data is now typically handled using hierarchical indexing (`MultiIndex`) within DataFrames or with other libraries such as xarray.

It's worth noting that while Series and DataFrames are mutable, it's a good practice to be cautious while modifying data in-place, especially when dealing with large datasets, as unintended changes can lead to data integrity issues.
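
As a minimal sketch of what mutability means here, the example below changes values in a Series and a DataFrame in place, adds a column, and then uses `copy()` to work on an independent copy. The `Senior` column and the `backup` name are illustrative only:

```python
import pandas as pd

s = pd.Series([10, 20, 30])
s.loc[0] = 99                    # change a value in place

df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})
df.loc[1, 'Age'] = 31            # change a value in place
df['Senior'] = df['Age'] > 30    # add a new column in place

backup = df.copy()               # independent copy of the data
backup.loc[0, 'Age'] = 0         # modifies the copy only

print(s.tolist())                # [99, 20, 30]
print(df.loc[0, 'Age'])          # 25
```

Working on a copy is one simple way to avoid the unintended in-place changes mentioned above, at the cost of extra memory.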

## Q7. Create a DataFrame using multiple Series. Explain with an example.

```python
import pandas as pd

# Create individual Series
names = pd.Series(['Alice', 'Bob', 'Charlie', 'David', 'Eve'])
ages = pd.Series([25, 32, 28, 45, 29])
salaries = pd.Series([60000, 80000, 70000, 90000, 75000])

# Combine Series into a DataFrame
data = {'Name': names, 'Age': ages, 'Salary': salaries}
df = pd.DataFrame(data)

print(df)
```

```
      Name  Age  Salary
0    Alice   25   60000
1      Bob   32   80000
2  Charlie   28   70000
3    David   45   90000
4      Eve   29   75000
```
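
In the example above, each dictionary key becomes a column name and each Series becomes that column. All three Series share the default index 0 to 4, so the rows line up directly. When the indexes differ, pandas aligns the Series by label instead of by position and fills the gaps with NaN. A small sketch with made-up labels:

```python
import pandas as pd

# Two Series whose indexes only partly overlap (labels are illustrative)
ages = pd.Series([25, 32], index=['alice', 'bob'])
salaries = pd.Series([80000, 70000], index=['bob', 'charlie'])

df = pd.DataFrame({'Age': ages, 'Salary': salaries})
print(df)

# 'bob' appears in both Series, so his row is complete;
# 'alice' has no salary and 'charlie' has no age, so those cells are NaN
print(df.loc['bob', 'Age'])             # 32.0
print(pd.isna(df.loc['alice', 'Salary']))  # True
```

Because NaN is a floating-point value, both columns end up with a float dtype even though the inputs were integers.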
