Q1. Create a Pandas Series that contains the following data: 4, 8, 15, 16, 23, and 42. Then, print the series.

In [4]:
import pandas as pd

# Create a Pandas Series with the given data
data = [4, 8, 15, 16, 23, 42]
series = pd.Series(data)

# Print the series
print(series)

0     4
1     8
2    15
3    16
4    23
5    42
dtype: int64


Q2. Create a list variable containing 10 elements, apply the pandas.Series function to it, and print the result.

In [5]:
import pandas as pd

# Create a list containing 10 elements
my_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Convert the list into a Pandas Series
series = pd.Series(my_list)

# Print the series
print(series)

0     1
1     2
2     3
3     4
4     5
5     6
6     7
7     8
8     9
9    10
dtype: int64


Q3. Create a Pandas DataFrame that contains the following data:

| Name   | Age | Gender |
|--------|-----|--------|
| Alice  | 25  | Female |
| Bob    | 30  | Male   |
| Claire | 27  | Female |

Then, print the DataFrame.

In [6]:
import pandas as pd

# Create a dictionary with the given data
data = {
    'Name': ['Alice', 'Bob', 'Claire'],
    'Age': [25, 30, 27],
    'Gender': ['Female', 'Male', 'Female']
}

# Create a Pandas DataFrame from the dictionary
df = pd.DataFrame(data)

# Print the DataFrame
print(df)

     Name  Age  Gender
0   Alice   25  Female
1     Bob   30    Male
2  Claire   27  Female


Q4. What is a ‘DataFrame’ in pandas, and how is it different from a pandas.Series? Explain with an example.

In Pandas, a DataFrame is a two-dimensional labeled data structure capable of holding data of different types (e.g., integers, floats, strings, etc.) in columns. It is similar to a spreadsheet or SQL table, where data is organized into rows and columns. Each column in a DataFrame can be of a different data type. DataFrames allow for various operations such as data manipulation, filtering, merging, and analysis.

On the other hand, a Pandas Series is a one-dimensional labeled array with a single dtype (e.g., integers, floats, or strings; mixed values are stored under the generic `object` dtype). A Series is essentially a single column of data with associated row labels, called the index. Unlike a Python list, a Series can be accessed by label as well as by position, and it supports vectorized operations.

Here's an example to illustrate the difference between a DataFrame and a Series:

```python
import pandas as pd

# Creating a Pandas Series
series_data = pd.Series([1, 2, 3, 4, 5])
print("Series:")
print(series_data)
print("\nType of series_data:", type(series_data))

# Creating a Pandas DataFrame
data = {
    'A': [1, 2, 3],
    'B': [4, 5, 6],
    'C': [7, 8, 9]
}
df = pd.DataFrame(data)
print("\nDataFrame:")
print(df)
print("\nType of df:", type(df))
```

Output:
```
Series:
0    1
1    2
2    3
3    4
4    5
dtype: int64

Type of series_data: <class 'pandas.core.series.Series'>

DataFrame:
   A  B  C
0  1  4  7
1  2  5  8
2  3  6  9

Type of df: <class 'pandas.core.frame.DataFrame'>
```

In the example above, `series_data` is a Pandas Series containing integers from 1 to 5. It has a single column of data and indices starting from 0.

On the other hand, `df` is a Pandas DataFrame containing three columns labeled 'A', 'B', and 'C'. Each column has its own data, and rows are indexed from 0 to 2. The DataFrame structure allows for more complex data organization and manipulation compared to a Series.
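
The relationship between the two structures can also be seen by pulling a Series out of a DataFrame. The following is a minimal sketch; the `people` data is hypothetical and used only for illustration.

```python
import pandas as pd

# Hypothetical sample data: two columns with different dtypes
people = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Claire'],
    'Age': [25, 30, 27],
})

# Selecting one column returns a Series that shares the DataFrame's index
ages = people['Age']
print(type(ages))
print(ages.ndim, people.ndim)  # 1 2

# Selecting a single row also returns a Series, labeled by the column names
first_row = people.iloc[0]
print(first_row)
```

In other words, a DataFrame can be thought of as a collection of Series that share one index.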

Q5. What are some common functions you can use to manipulate data in a Pandas DataFrame? Can you give an example of when you might use one of these functions?

There are numerous functions available in Pandas for manipulating data in a DataFrame. Some common functions include:

1. `head()` and `tail()`: These functions allow you to view the first or last few rows of a DataFrame, respectively. They are useful for quickly inspecting the structure of your data.

2. `info()`: This function prints a concise summary of the DataFrame, including each column's data type and non-null count (from which missing values can be inferred) and the memory usage.

3. `describe()`: This function generates descriptive statistics for numeric columns in the DataFrame such as count, mean, standard deviation, minimum, maximum, and quartiles.

4. `drop()`: This function allows you to drop rows or columns from the DataFrame based on labels or indices. It's useful for removing unnecessary data.

5. `fillna()`: This function fills missing values in the DataFrame with a value you specify, for example a column's mean, median, or mode computed beforehand. Forward and backward filling are available through `ffill()` and `bfill()`.

6. `groupby()`: This function is used to group data in the DataFrame based on one or more columns and perform aggregate functions such as sum, mean, count, etc. It's useful for data analysis and summarization.

7. `merge()` and `concat()`: These functions are used to combine multiple DataFrames either by merging on common columns or concatenating along rows or columns.

8. `apply()`: This function applies a function along an axis of the DataFrame. It's useful for applying custom functions to each row or column.

9. `sort_values()`: This function sorts the DataFrame by the values of one or more columns. It's useful for arranging data in ascending or descending order.

10. `pivot_table()`: This function creates a spreadsheet-style pivot table from the DataFrame. It's useful for reshaping and summarizing data.
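
Several of the functions listed above are often used together during data cleaning. The following is a minimal sketch with hypothetical data (the `sales` frame and its column names are assumptions made for illustration) that fills a missing value, drops a column, and sorts the result.

```python
import pandas as pd

# Hypothetical sales data with one missing value
sales = pd.DataFrame({
    'Product': ['A', 'B', 'C', 'D'],
    'Region': ['East', 'West', 'East', 'West'],
    'Sales': [100.0, None, 120.0, 90.0],
})

# fillna(): replace the missing value with the column mean, computed first
mean_sales = sales['Sales'].mean()
filled = sales.fillna({'Sales': mean_sales})

# drop(): remove a column that is not needed for the analysis
trimmed = filled.drop(columns=['Region'])

# sort_values(): order the rows from highest to lowest sales
ranked = trimmed.sort_values('Sales', ascending=False)

# head(): inspect the top two rows
print(ranked.head(2))
```

Each call here returns a new DataFrame, so the original `sales` frame keeps its missing value.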

Example:
Suppose you have a DataFrame containing sales data with a 'Product' column and a 'Sales' column. You want to calculate the total sales for each product. You can use the `groupby()` function to group the data by 'Product' and then apply the `sum()` function to calculate the total sales for each product.

```python
import pandas as pd

# Sample sales data
data = {
    'Product': ['A', 'B', 'A', 'C', 'B', 'C'],
    'Sales': [100, 150, 200, 120, 180, 90]
}

df = pd.DataFrame(data)

# Group by 'Product' and calculate total sales for each product
total_sales = df.groupby('Product')['Sales'].sum()

print(total_sales)
```

Output:
```
Product
A    300
B    330
C    210
Name: Sales, dtype: int64
```

In this example, the `groupby()` function is used to group the DataFrame by 'Product', and the `sum()` function is applied to calculate the total sales for each product. This is a common use case for data analysis and summarization in Pandas.
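
`merge()` and `pivot_table()` from the list above can be sketched in the same spirit. The `Quarter` column and the `categories` lookup table below are hypothetical additions to the sales data, introduced only for illustration.

```python
import pandas as pd

# Sales data extended with a hypothetical 'Quarter' column
sales = pd.DataFrame({
    'Product': ['A', 'B', 'A', 'C', 'B', 'C'],
    'Quarter': ['Q1', 'Q1', 'Q2', 'Q1', 'Q2', 'Q2'],
    'Sales': [100, 150, 200, 120, 180, 90],
})

# Hypothetical lookup table mapping each product to a category
categories = pd.DataFrame({
    'Product': ['A', 'B', 'C'],
    'Category': ['Toys', 'Books', 'Toys'],
})

# merge(): attach the category to every sales row via the shared 'Product' column
merged = sales.merge(categories, on='Product', how='left')

# pivot_table(): total sales per category (rows) and quarter (columns)
summary = merged.pivot_table(
    index='Category', columns='Quarter', values='Sales', aggfunc='sum'
)
print(summary)
```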

Q6. Which of the following are mutable in nature: Series, DataFrame, Panel?

All three are value-mutable, meaning the values they hold can be changed in place. They differ in whether their size can change.

- **Series**: A Series is value-mutable, so you can overwrite its elements after creation. The pandas documentation describes its length as fixed (size-immutable), so adding or removing elements is usually expressed as creating a new Series.

- **DataFrame**: A DataFrame is both value-mutable and size-mutable. You can modify values, add or remove columns, and perform other operations that change its structure or content.

- **Panel**: A Panel was also value-mutable and size-mutable. It was a three-dimensional data structure containing multiple DataFrame-like objects, and you could modify the content of individual items as well as add or remove them. However, Panel was deprecated in pandas 0.20.0 and removed in 0.25.0, and you're encouraged to use other data structures like MultiIndex DataFrames instead.
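
The mutability of Series and DataFrame can be checked directly. The following is a minimal sketch with hypothetical values; Panel is left out because it no longer exists in current pandas releases.

```python
import pandas as pd

# Value mutation: an existing element of a Series can be overwritten in place
s = pd.Series([4, 8, 15])
s.iloc[0] = 99
print(s.tolist())           # [99, 8, 15]

# Value and size mutation on a DataFrame
df = pd.DataFrame({'A': [1, 2, 3]})
df.loc[0, 'A'] = 10         # overwrite a single cell
df['B'] = [4, 5, 6]         # add a column to the same object
del df['A']                 # remove a column from the same object
print(df.columns.tolist())  # ['B']
```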

Q7. Create a DataFrame using multiple Series. Explain with an example.

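You can create a DataFrame using multiple Series by passing them as a dictionary to the `pd.DataFrame()` constructor. Each Series becomes a column in the DataFrame. The Series are aligned on their index labels, so they do not need to have the same length; any label missing from a Series is filled with NaN in that column. Here's an example: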

```python
import pandas as pd

# Creating multiple Series
series1 = pd.Series([1, 2, 3, 4, 5], name='A')
series2 = pd.Series(['a', 'b', 'c', 'd', 'e'], name='B')

# Creating a DataFrame using the multiple Series.
# The dictionary keys set the column names; each Series' own name is not used here.
df = pd.DataFrame({'Column_A': series1, 'Column_B': series2})

# Print the DataFrame
print(df)
```

Output:
```
   Column_A Column_B
0         1        a
1         2        b
2         3        c
3         4        d
4         5        e
```

In this example:
- We create two Pandas Series, `series1` and `series2`.
- We then create a DataFrame, `df`, by passing a dictionary where the keys are the column names and the values are the corresponding Series.
- Finally, we print the DataFrame, which contains two columns ('Column_A' and 'Column_B') with data from the respective Series.

This demonstrates how to create a DataFrame using multiple Series. Each Series becomes a column in the DataFrame, and the union of the Series' indexes becomes the index of the DataFrame. Here both Series share the default index 0 to 4, so the DataFrame has the same index.
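
The index alignment is easier to see when the Series do not share the same labels. The following is a minimal sketch with hypothetical data (the `heights` and `weights` Series are assumptions made for illustration). It also shows `pd.concat` with `axis=1` as an alternative that takes the column names from the Series' `name` attributes.

```python
import pandas as pd

# Two hypothetical Series with partially overlapping index labels
heights = pd.Series([1.70, 1.82], index=['alice', 'bob'], name='height')
weights = pd.Series([80, 65], index=['bob', 'claire'], name='weight')

# Dictionary form: rows are the union of both indexes, gaps become NaN
aligned = pd.DataFrame({'height': heights, 'weight': weights})
print(aligned)

# concat form: column names are taken from each Series' name
combined = pd.concat([heights, weights], axis=1)
print(combined.columns.tolist())  # ['height', 'weight']
```

Only `bob` appears in both Series, so his is the only row with values in both columns.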