## Q1. Create a Pandas Series that contains the following data: 4, 8, 15, 16, 23, and 42. Then, print the series.

In [1]:
import pandas as pd

# Build the Series from a plain Python list
data = [4, 8, 15, 16, 23, 42]
series = pd.Series(data)
print(series)

0     4
1     8
2    15
3    16
4    23
5    42
dtype: int64


## Q2. Create a list variable containing 10 elements, apply the pandas.Series function to it, and print the result.

In [2]:
import pandas as pd

# A list of 10 elements converted to a Series
lst = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
series = pd.Series(lst)
print(series)

0     1
1     2
2     3
3     4
4     5
5     6
6     7
7     8
8     9
9    10
dtype: int64


## Q3. Create a Pandas DataFrame that contains the following data:

![image.png](attachment:84c36e84-a726-46cd-9398-ddb618134daa.png)

In [3]:
import pandas as pd

# Each dictionary key becomes a column name
data = {"Name": ["Alice", "Bob", "Claire"],
        "Age": [25, 30, 27],
        "Gender": ["Female", "Male", "Female"]}
df = pd.DataFrame(data)
print(df)

     Name  Age  Gender
0   Alice   25  Female
1     Bob   30    Male
2  Claire   27  Female


## Q4. What is a ‘DataFrame’ in pandas, and how is it different from pandas.Series? Explain with an example.

In pandas, a DataFrame is a two-dimensional labeled data structure that consists of columns of various data types. It is similar to a spreadsheet or a SQL table, where each column can be a different type (e.g., integer, float, string) and has a label. The DataFrame can also have row labels, which are called an index.

On the other hand, a pandas Series is a one-dimensional labeled array that can contain any data type, including numeric, character, and even other Python objects. Each element in a Series has a label, which is called an index.

The main difference between a DataFrame and a Series is that a DataFrame is a two-dimensional object, while a Series is a one-dimensional object. Additionally, a DataFrame can have multiple columns, while a Series has only one column.

In [5]:
import pandas as pd

# create a dictionary with data for a DataFrame
data = {'name': ['John', 'Alice', 'Bob', 'Lisa'],
        'age': [25, 30, 35, 40],
        'gender': ['M', 'F', 'M', 'F']}

# create a DataFrame
df = pd.DataFrame(data)

# print the DataFrame
print(df)


    name  age gender
0   John   25      M
1  Alice   30      F
2    Bob   35      M
3   Lisa   40      F


In [6]:
# create a dictionary with data for a Series
data = {'John': 25, 'Alice': 30, 'Bob': 35, 'Lisa': 40}

# create a Series
s = pd.Series(data)

# print the Series
print(s)

John     25
Alice    30
Bob      35
Lisa     40
dtype: int64
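
The two structures are closely related: each column of a DataFrame is itself a Series. The following is a minimal sketch of that relationship, assuming a small two-column DataFrame (the variable names `age_col` and `age_frame` are only illustrative).

```python
import pandas as pd

df = pd.DataFrame({'name': ['John', 'Alice'], 'age': [25, 30]})

# Selecting one column with single brackets yields a one-dimensional Series
age_col = df['age']
print(type(age_col).__name__, age_col.ndim)

# Selecting with a list of column names keeps a two-dimensional DataFrame
age_frame = df[['age']]
print(type(age_frame).__name__, age_frame.ndim)
```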


## Q5. What are some common functions you can use to manipulate data in a Pandas DataFrame? Can you give an example of when you might use one of these functions?

Pandas provides a wide range of functions for manipulating data in a DataFrame. Here are some common functions you can use:

`head()` and `tail()`: these functions can be used to get the first or last n rows of the DataFrame. For example, `df.head()` returns the first 5 rows of the DataFrame.


In [9]:
# Example usage of head() function
import pandas as pd

# Create a DataFrame
data = {'Name': ['Alice', 'Bob', 'Claire'],
        'Age': [25, 30, 27],
        'Gender': ['Female', 'Male', 'Female']}
df = pd.DataFrame(data)

# View the first two rows of the DataFrame
print(df.head(2))

    Name  Age  Gender
0  Alice   25  Female
1    Bob   30    Male
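
`tail()` works the same way from the other end. A short sketch, assuming the same three-row data as above:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Claire'],
                   'Age': [25, 30, 27],
                   'Gender': ['Female', 'Male', 'Female']})

# View the last two rows; the original index labels (1 and 2) are preserved
last_two = df.tail(2)
print(last_two)
```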




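`describe()`: this function provides summary statistics, including count, mean, standard deviation, minimum, maximum, and quartile values. By default it summarizes only the numeric columns of a mixed-type DataFrame, which is why the output below shows `Age` but not `Name` or `Gender`.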


In [10]:
# Example usage of describe() function
import pandas as pd

# Create a DataFrame
data = {'Name': ['Alice', 'Bob', 'Claire'],
        'Age': [25, 30, 27],
        'Gender': ['Female', 'Male', 'Female']}
df = pd.DataFrame(data)

# Compute summary statistics of the DataFrame
print(df.describe())

             Age
count   3.000000
mean   27.333333
std     2.516611
min    25.000000
25%    26.000000
50%    27.000000
75%    28.500000
max    30.000000
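
The output above includes only `Age` because, for a DataFrame with mixed column types, `describe()` summarizes the numeric columns by default. As a sketch, assuming the same three-row data, passing `include='all'` also summarizes the text columns (with rows such as `unique`, `top`, and `freq`):

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Claire'],
                   'Age': [25, 30, 27],
                   'Gender': ['Female', 'Male', 'Female']})

# Summarize every column; statistics that do not apply to a column are NaN
summary = df.describe(include='all')
print(summary)
```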


`groupby()`: this function allows you to group the DataFrame by one or more columns and perform operations on each group. For example, you can group a DataFrame of sales data by product and compute the total revenue for each product.


In [11]:
# Example usage of groupby() function
import pandas as pd

# Create a DataFrame
data = {'Name': ['Alice', 'Bob', 'Claire', 'David', 'Eva', 'Frank'],
        'Age': [25, 30, 27, 25, 35, 32],
        'Gender': ['Female', 'Male', 'Female', 'Male', 'Female', 'Male']}
df = pd.DataFrame(data)

# Group the DataFrame by 'Gender' column and compute the mean of 'Age' column for each group
grouped_df = df.groupby('Gender')['Age'].mean()
print(grouped_df)

Gender
Female    29.0
Male      29.0
Name: Age, dtype: float64
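
The sales scenario mentioned above can be sketched in the same way. The `sales` DataFrame, its column names, and its figures are made up purely for illustration.

```python
import pandas as pd

# Hypothetical sales records: one row per transaction
sales = pd.DataFrame({'product': ['A', 'B', 'A', 'C', 'B'],
                      'revenue': [100, 250, 150, 300, 50]})

# Group by product and total the revenue within each group
revenue_by_product = sales.groupby('product')['revenue'].sum()
print(revenue_by_product)
```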




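`apply()`: called on a Series (a single column), this function applies a function to each value; called on a DataFrame, it applies a function to each column, or to each row with `axis=1`. For example, you can use `apply()` on a column to transform every value in it, as the example below does by squaring each age.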


In [16]:
import pandas as pd

# create a DataFrame
data = {'name': ['John', 'Alice', 'Bob', 'Lisa'],
        'age': [25, 30, 35, 40],
        'gender': ['M', 'F', 'M', 'F']}
df = pd.DataFrame(data)

# define a function to apply
def square(x):
    return x**2

# apply the function to the 'age' column
df['age_squared'] = df['age'].apply(square)

# print the modified DataFrame
print(df)


    name  age gender  age_squared
0   John   25      M          625
1  Alice   30      F          900
2    Bob   35      M         1225
3   Lisa   40      F         1600
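
The example above calls `apply()` on a single column, which is a Series. When `apply()` is called on a whole DataFrame, the function receives an entire column, or an entire row if `axis=1` is given. A small sketch, assuming a hypothetical table of scores:

```python
import pandas as pd

# Hypothetical scores used only for illustration
scores = pd.DataFrame({'math': [80, 90, 75],
                       'science': [85, 70, 95]})

# Default axis=0: the function is called once per column
col_max = scores.apply(lambda col: col.max())

# axis=1: the function is called once per row
row_total = scores.apply(lambda row: row['math'] + row['science'], axis=1)

print(col_max)
print(row_total)
```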




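`merge()`: this function combines two DataFrames based on a common column. For example, you can merge a DataFrame of customer data with a DataFrame of sales data on a shared customer ID column. By default `merge()` performs an inner join, so only keys present in both DataFrames appear in the result.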


In [17]:
import pandas as pd

# create two DataFrames
df1 = pd.DataFrame({'key': ['A', 'B', 'C', 'D'],
                   'value': [1, 2, 3, 4]})
df2 = pd.DataFrame({'key': ['B', 'D', 'E', 'F'],
                   'value': [5, 6, 7, 8]})

# merge the two DataFrames
merged_df = pd.merge(df1, df2, on='key')

# print the merged DataFrame
print(merged_df)


  key  value_x  value_y
0   B        2        5
1   D        4        6
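
Only keys `B` and `D` appear above because `merge()` defaults to an inner join. The `how` parameter selects other join types, and `suffixes` controls how overlapping column names are renamed. A sketch using the same two DataFrames (the suffix names are arbitrary choices):

```python
import pandas as pd

df1 = pd.DataFrame({'key': ['A', 'B', 'C', 'D'],
                    'value': [1, 2, 3, 4]})
df2 = pd.DataFrame({'key': ['B', 'D', 'E', 'F'],
                    'value': [5, 6, 7, 8]})

# Outer join: keep every key from either DataFrame, filling gaps with NaN
outer_df = pd.merge(df1, df2, on='key', how='outer',
                    suffixes=('_left', '_right'))

# Left join: keep every key from df1 only
left_df = pd.merge(df1, df2, on='key', how='left',
                   suffixes=('_left', '_right'))

print(outer_df)
print(left_df)
```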



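`pivot_table()`: this function creates a spreadsheet-style pivot table from the data in a DataFrame. For example, you can create a pivot table to show the total sales for each product and each quarter of the year. Values that fall into the same cell are aggregated, using the mean unless another `aggfunc` is given, and combinations with no data are shown as `NaN`.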


In [18]:
import pandas as pd

# create a DataFrame
data = {'name': ['John', 'Alice', 'Bob', 'Lisa'],
        'age': [25, 30, 35, 40],
        'gender': ['M', 'F', 'M', 'F'],
        'score': [80, 90, 75, 85]}
df = pd.DataFrame(data)

# create a pivot table
pivot = pd.pivot_table(df, values='score', index='name', columns='gender')

# print the pivot table
print(pivot)


gender     F     M
name              
Alice   90.0   NaN
Bob      NaN  75.0
John     NaN  80.0
Lisa    85.0   NaN
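
The product-by-quarter scenario mentioned above can be sketched as follows. The `sales` data is hypothetical, and `aggfunc='sum'` is passed because the default aggregation is the mean.

```python
import pandas as pd

# Hypothetical sales records used only for illustration
sales = pd.DataFrame({'product': ['A', 'A', 'B', 'B', 'A'],
                      'quarter': ['Q1', 'Q2', 'Q1', 'Q2', 'Q1'],
                      'revenue': [100, 150, 200, 50, 25]})

# Rows are products, columns are quarters, cells hold total revenue
table = pd.pivot_table(sales, values='revenue', index='product',
                       columns='quarter', aggfunc='sum')
print(table)
```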


## Q6. Which of the following is mutable in nature: Series, DataFrame, or Panel?

In pandas, data mutability refers to the ability to change the values within a data structure, while size mutability refers to the ability to change the size or shape of the data structure itself.

All three (Series, DataFrame, and Panel) are data mutable, meaning that you can change the values within them. By the conventional definition, only DataFrame and Panel are size mutable: columns can be inserted into or deleted from a DataFrame, whereas a Series is treated as having a fixed length. Note that Panel was deprecated in pandas 0.20 and removed in pandas 0.25, so it is not available in current versions.
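
A minimal sketch of both kinds of mutability, using a small Series and DataFrame made up for illustration (Panel is omitted because it no longer exists in current pandas releases):

```python
import pandas as pd

# Data mutability: overwrite a value inside a Series
s = pd.Series([4, 8, 15])
s.iloc[0] = 99

# Data mutability: overwrite a value inside a DataFrame
df = pd.DataFrame({'x': [1, 2, 3]})
df.loc[0, 'x'] = 10

# Size mutability: adding a column changes the DataFrame's shape in place
shape_before = df.shape
df['y'] = [4, 5, 6]
shape_after = df.shape

print(s.tolist(), shape_before, shape_after)
```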

## Q7. Create a DataFrame using multiple Series. Explain with an example.

You can create a pandas DataFrame from multiple Series by passing a dictionary of Series objects to the DataFrame constructor. Each Series becomes a column in the resulting DataFrame, and the index of each Series is used to align the data. Here's an example:

In [13]:
import pandas as pd

# create three Series objects
names = pd.Series(['Alice', 'Bob', 'Claire'])
ages = pd.Series([25, 30, 27])
genders = pd.Series(['Female', 'Male', 'Female'])

# combine the Series objects into a dictionary
data = {'Name': names, 'Age': ages, 'Gender': genders}

# create the DataFrame
df = pd.DataFrame(data)

# print the DataFrame
print(df)

     Name  Age  Gender
0   Alice   25  Female
1     Bob   30    Male
2  Claire   27  Female
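
In the example above every Series shares the default index 0 to 2, so the alignment is not visible. The following sketch, using two hypothetical Series with only partly overlapping labels, shows how the constructor aligns on the index and fills missing entries with `NaN`:

```python
import pandas as pd

# Two Series whose index labels only partly overlap (made-up data)
ages = pd.Series([25, 30], index=['alice', 'bob'])
cities = pd.Series(['Paris', 'Oslo'], index=['bob', 'claire'])

# The resulting index is the union of both indexes; gaps become NaN
aligned = pd.DataFrame({'Age': ages, 'City': cities})
print(aligned)
```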
