# Q1. Create a Pandas Series that contains the following data: 4, 8, 15, 16, 23, and 42. Then, print the series.

In [1]:
import pandas as pd

# Create a Pandas Series
data = [4, 8, 15, 16, 23, 42]
my_series = pd.Series(data)

# Print the Series
print(my_series)


0     4
1     8
2    15
3    16
4    23
5    42
dtype: int64


# Q2. Create a variable of list type containing 10 elements, apply the pandas.Series function to it, and print the result.

In [3]:
import pandas as pd

# Create a list with 10 elements
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Convert the list to a Pandas Series
my_series = pd.Series(data)

# Print the Series
print(my_series)


0     1
1     2
2     3
3     4
4     5
5     6
6     7
7     8
8     9
9    10
dtype: int64


# Q3. Create a Pandas DataFrame that contains the following data:

| Name   | Age | Gender |
|--------|-----|--------|
| Alice  | 25  | Female |
| Bob    | 30  | Male   |
| Claire | 27  | Female |

Then, print the DataFrame.

In [5]:
import pandas as pd

# Given data
data = {
    'Name': ['Alice', 'Bob', 'Claire'],
    'Age': [25, 30, 27],
    'Gender': ['Female', 'Male', 'Female']
}

# Create a Pandas DataFrame
df = pd.DataFrame(data)

# Print the DataFrame
print(df)


     Name  Age  Gender
0   Alice   25  Female
1     Bob   30    Male
2  Claire   27  Female


# Q4. What is ‘DataFrame’ in pandas and how is it different from pandas.Series? Explain with an example.


In Pandas, a DataFrame is a two-dimensional, tabular data structure with labeled axes (rows and columns). It is a primary data structure used for data manipulation and analysis. A DataFrame can be thought of as a container for Pandas Series. Each column in a DataFrame is essentially a Pandas Series, and the DataFrame provides a structure to organize and work with multiple Series together.

Here's a brief explanation of both DataFrame and Series:

DataFrame:

- A DataFrame is a 2D data structure with rows and columns.
- It can be created using various data sources, such as lists, dictionaries, NumPy arrays, or by reading data from files like CSV, Excel, SQL databases, etc.
- Columns in a DataFrame can have different data types.
- It is suitable for representing structured, tabular data, often resembling a spreadsheet or a SQL table.
- Allows easy manipulation, filtering, and analysis of data.

Series:

- A Series is a 1D labeled array capable of holding any data type.
- It can be created using lists, NumPy arrays, or other data structures.
- A Series has a single index associated with its values, which makes it different from a NumPy array.
- It is useful for representing a single column or row of data.
- Provides vectorized operations, similar to NumPy arrays.

Now, let's illustrate the difference with an example:

In [6]:
import pandas as pd

# Creating a Pandas Series
series_data = pd.Series([10, 20, 30, 40], name='MySeries')

# Creating a Pandas DataFrame
df_data = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Claire'],
    'Age': [25, 30, 27],
    'Gender': ['Female', 'Male', 'Female']
})

# Printing the Series and DataFrame
print("Pandas Series:")
print(series_data)
print("\nPandas DataFrame:")
print(df_data)


Pandas Series:
0    10
1    20
2    30
3    40
Name: MySeries, dtype: int64

Pandas DataFrame:
     Name  Age  Gender
0   Alice   25  Female
1     Bob   30    Male
2  Claire   27  Female
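
The statement that each DataFrame column is itself a Series can be checked directly. The following is a minimal sketch that reuses the sample data above; the names `age_column`, `age_frame`, and `first_row` are illustrative and not part of the original example.

```python
import pandas as pd

df_data = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Claire'],
    'Age': [25, 30, 27],
    'Gender': ['Female', 'Male', 'Female']
})

# Selecting one column with single brackets returns a Series
age_column = df_data['Age']
print(type(age_column).__name__)

# Selecting with a list of column names returns a DataFrame
age_frame = df_data[['Age']]
print(type(age_frame).__name__)

# A Series has one dimension, a DataFrame has two
print(age_column.ndim, df_data.ndim)

# A single row is also returned as a Series, indexed by the column labels
first_row = df_data.iloc[0]
print(list(first_row.index))
```

Single brackets give back the 1D building block, while double brackets keep the 2D container, which is often the practical difference to remember when selecting data.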


# Q5. What are some common functions you can use to manipulate data in a Pandas DataFrame? Can you give an example of when you might use one of these functions?

Pandas provides a wide range of methods, attributes, and indexers for inspecting and manipulating data in a DataFrame. Here are some common ones along with examples:

1. **`head()` and `tail()`:**
   - Display the first or last N rows of the DataFrame.
   ```python
   df.head(3)  # Display the first 3 rows
   df.tail(2)  # Display the last 2 rows
   ```

2. **`info()`:**
   - Display a concise summary of the DataFrame, including column data types, non-null counts, and memory usage.
   ```python
   df.info()
   ```

3. **`describe()`:**
   - Generate descriptive statistics, including measures of central tendency and variability.
   ```python
   df.describe()
   ```

4. **`shape`:**
   - Attribute (no parentheses) that holds a tuple with the number of rows and columns.
   ```python
   df.shape  # Output: (rows, columns)
   ```

5. **`columns`:**
   - Attribute that holds the column labels of the DataFrame.
   ```python
   df.columns  # Output: Index(['Name', 'Age', 'Gender'], dtype='object')
   ```

6. **`loc[]` and `iloc[]`:**
   - Access a group of rows and columns by labels (`loc[]`) or integer location (`iloc[]`).
   ```python
   df.loc[0, 'Name']  # Access the value at row label 0 in the 'Name' column
   df.iloc[1, 2]  # Access the value in the second row and third column
   ```

7. **`sort_values()`:**
   - Sort the DataFrame by specified column(s).
   ```python
   df.sort_values(by='Age', ascending=False)
   ```

8. **`groupby()`:**
   - Group rows by a specified column and perform aggregation.
   ```python
   df.groupby('Gender')['Age'].mean()  # Calculate the mean age for each gender
   ```

9. **`drop()`:**
   - Drop specified labels from rows or columns.
   ```python
   df.drop(0, axis=0)  # Return a new DataFrame without the row labeled 0
   ```

10. **`apply()`:**
    - Apply a function along an axis of the DataFrame or, as in this example, to each element of a single column (a Series).
    ```python
    df['Age'] = df['Age'].apply(lambda x: x + 1)  # Increment each age by 1
    ```

These are just a few examples, and Pandas offers many more functions for data manipulation. The choice of function depends on the specific task you want to perform, such as filtering data, handling missing values, merging DataFrames, and more.
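
As an illustration of when these functions are useful, the sketch below answers three simple questions about the sample data from Q3: who is the oldest person, what is the mean age per gender, and who is older than 26. The variable names `oldest`, `mean_age`, and `older_names` are chosen for this example only.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Claire'],
    'Age': [25, 30, 27],
    'Gender': ['Female', 'Male', 'Female']
})

# Oldest person: sort by age in descending order and keep the first row
oldest = df.sort_values(by='Age', ascending=False).head(1)
print(oldest)

# Mean age per gender: group the rows, then aggregate the 'Age' column
mean_age = df.groupby('Gender')['Age'].mean()
print(mean_age)

# Names of people older than 26: boolean mask for rows, label for the column
older_names = df.loc[df['Age'] > 26, 'Name'].tolist()
print(older_names)
```

None of these calls modifies `df`; each returns a new object, so the results can be chained or stored without affecting the original data.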

# Q6. Which of the following are mutable in nature: Series, DataFrame, Panel?


In Pandas, both Series and DataFrame are mutable, while Panel was deprecated in version 0.20.0 and removed in version 0.25.0, so it is not available in recent versions of Pandas.

Series:

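A Pandas Series is value-mutable, meaning you can modify its elements in place and perform various operations on it. The pandas documentation describes a Series as not size-mutable: most operations that add or remove elements, such as `drop()` or `pd.concat()`, return a new Series rather than resizing the original.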
Example:

In [7]:
import pandas as pd

series_data = pd.Series([10, 20, 30])
series_data[1] = 25  # Modifying an element
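
A short sketch of what this looks like in practice: assigning to an existing label changes the Series in place, whereas `drop()` returns a new Series and leaves the original length unchanged. The name `shorter` is illustrative only.

```python
import pandas as pd

series_data = pd.Series([10, 20, 30])

# Value mutation: the existing object is changed in place
series_data.loc[1] = 25

# drop() does not shrink the original; it returns a new Series
shorter = series_data.drop(0)

print(series_data.tolist())
print(len(series_data), len(shorter))
```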


DataFrame:

A Pandas DataFrame is also mutable. You can modify values in specific cells, add or remove columns, and perform various operations on the DataFrame.
Example:

In [9]:
import pandas as pd

data = {'Name': ['Alice', 'Bob'], 'Age': [25, 30]}
df = pd.DataFrame(data)
df.loc[1, 'Age'] = 32  # Modify one value with label-based indexing (avoids chained assignment)

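The other kinds of DataFrame mutation mentioned above, adding and removing columns, can be sketched as follows. The `City` column and its values are hypothetical and only serve the illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})

# Change a single cell in place with label-based indexing
df.loc[1, 'Age'] = 32

# Add a column by assignment (hypothetical values)
df['City'] = ['Paris', 'Rome']

# drop() returns a new DataFrame, so the result is reassigned here
df = df.drop(columns='Name')

print(df.columns.tolist())
print(df.shape)
```

Using a single `df.loc[row, column] = value` assignment, rather than chaining two indexing steps, is the form recommended by the pandas indexing guide for setting values.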

Panel:

The Panel data structure was deprecated in Pandas 0.20.0 and removed entirely in version 0.25.0, so it is not available in current releases. Instead, MultiIndex DataFrames are recommended for handling three-dimensional data.
Example (for reference only; this fails on pandas 0.25.0 and later, so the call is commented out):

In [None]:
import pandas as pd

# Creating a Panel (removed from pandas)
# df1 and df2 stand for two existing DataFrames
# panel_data = pd.Panel({'Item1': df1, 'Item2': df2})


For current and future projects, it is recommended to use Series and DataFrame for handling one-dimensional and two-dimensional data, respectively. If you need to work with three-dimensional data, multi-index DataFrames provide a more flexible and widely supported solution.
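
One way to represent Panel-like data with a multi-index DataFrame is to stack several 2D tables with `pd.concat`, using the dictionary keys as an outer index level. This is a minimal sketch under stated assumptions: `df1` and `df2` are hypothetical tables invented for the example, and the level names `Item` and `Row` are arbitrary.

```python
import pandas as pd

# Two hypothetical 2D tables that a Panel would have held as separate items
df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'A': [5, 6], 'B': [7, 8]})

# Passing a dict to concat uses its keys as the outer level of a MultiIndex
stacked = pd.concat({'Item1': df1, 'Item2': df2}, names=['Item', 'Row'])
print(stacked)

# Select one item (a 2D slice) by its outer label
print(stacked.loc['Item2'])

# Select a single cell: (item, row) on the index, then the column
print(stacked.loc[('Item1', 1), 'B'])
```

The outer level plays the role of the third dimension, and all the usual DataFrame tools (`groupby`, `loc`, aggregation) keep working on the stacked result.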

# Q7. Create a DataFrame using multiple Series. Explain with an example.

You can create a DataFrame using multiple Series by combining them into a dictionary and then converting that dictionary to a DataFrame. Each Series will represent a column in the DataFrame. Here's an example:

In [11]:
import pandas as pd

# Creating multiple Pandas Series
names = pd.Series(['Alice', 'Bob', 'Claire'])
ages = pd.Series([25, 30, 27])
genders = pd.Series(['Female', 'Male', 'Female'])

# Combining the Series into a dictionary
data = {'Name': names, 'Age': ages, 'Gender': genders}

# Creating a Pandas DataFrame from the dictionary
df = pd.DataFrame(data)

# Display the resulting DataFrame
print(df)


     Name  Age  Gender
0   Alice   25  Female
1     Bob   30    Male
2  Claire   27  Female


In this example, three Pandas Series (names, ages, and genders) are created. These Series represent columns of the DataFrame. The data dictionary is then created, where each key-value pair corresponds to a column name and its associated Series. Finally, the pd.DataFrame() constructor is used to create the DataFrame (df) from the dictionary.

This way, you can easily create a DataFrame from multiple Series, and each Series becomes a column in the resulting DataFrame.
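
One detail worth knowing is that the Series are aligned on their index labels, not on position. The sketch below uses hypothetical string labels (`'a'`, `'b'`, `'c'`) and deliberately leaves one age out to show that missing labels are filled with `NaN`; it also shows `pd.concat` with `axis=1` as an alternative way to combine Series into columns.

```python
import pandas as pd

names = pd.Series(['Alice', 'Bob', 'Claire'], index=['a', 'b', 'c'])
ages = pd.Series([25, 30], index=['a', 'b'])   # no entry for label 'c'

# The dictionary route: rows are the union of all index labels
df = pd.DataFrame({'Name': names, 'Age': ages})
print(df)

# Alternative: concatenate along columns; each Series name becomes a column label
df2 = pd.concat([names.rename('Name'), ages.rename('Age')], axis=1)
print(df2)
```

Because `NaN` is a floating-point value, the `Age` column is stored as `float64` here instead of `int64`.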