In [None]:
Q1. Create a Pandas Series that contains the following data: 4, 8, 15, 16, 23, and 42. Then, print the series.

In [2]:
import pandas as pd
data = [4,8,15,16,23,42]
series = pd.Series(data)
print(series)

0     4
1     8
2    15
3    16
4    23
5    42
dtype: int64


In [None]:
Q2. Create a variable of list type containing 10 elements, apply the pandas.Series function to it, and print the result.

In [3]:
l1 = [1,2,3,4,5,6,7,8,9,10]
series = pd.Series(l1)
print(series)

0     1
1     2
2     3
3     4
4     5
5     6
6     7
7     8
8     9
9    10
dtype: int64


In [None]:
Q3. Create a Pandas DataFrame that contains the following data:
Name	Age	Gender
Alice	25	Female
Bob	30	Male
Claire	27	Female
Then, print the DataFrame.

In [4]:
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Claire'],
    'Age': [25, 30, 27],
    'Gender': ['Female', 'Male', 'Female']}

df1 = pd.DataFrame(data)

print(df1)


     Name  Age  Gender
0   Alice   25  Female
1     Bob   30    Male
2  Claire   27  Female


In [None]:
Q4. What is ‘DataFrame’ in pandas and how is it different from pandas.Series? Explain with an example.

In [None]:
In Pandas, a DataFrame is a two-dimensional, size-mutable, and potentially heterogeneous tabular data structure with labeled axes (rows and columns). It can be thought of as a spreadsheet or SQL table where you have rows and columns, and each column can contain different types of data (e.g., numbers, text, dates). DataFrames are one of the most commonly used data structures in Pandas and are particularly useful for data manipulation, analysis, and cleaning.

On the other hand, a Series is a one-dimensional labeled array in Pandas. It can be thought of as a single column of data from a DataFrame. A Series is similar to a Python list or NumPy array but comes with additional features like labeled indexes, making it more powerful and versatile for data analysis.
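
To make the relationship concrete, here is a minimal sketch that checks that selecting one column of a DataFrame yields a Series. The frame `people` and its columns `name` and `age` are made up for illustration.

```python
import pandas as pd

# Hypothetical data, used only for illustration
people = pd.DataFrame({"name": ["Alice", "Bob"], "age": [25, 30]})

# Selecting a single column returns a Series that shares the DataFrame's row index
age = people["age"]
print(type(people).__name__)   # DataFrame
print(type(age).__name__)      # Series
print(people.ndim, age.ndim)   # 2 1

# Each column has its own dtype, so one frame can mix text and numbers
print(people.dtypes)
```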

In [5]:
# Create a pandas Series object
fruits = pd.Series(['Apple', 'Banana', 'Cherry', 'Durian'])

# Create a pandas DataFrame object
df = pd.DataFrame({'Fruit': fruits, 'Color': ['Red', 'Yellow', 'Red', 'Green'], 'Price': [1.0, 0.5, 2.0, 3.0]})

print('Fruits Series:')
print(fruits)
print('\nDataFrame:')
print(df)

Fruits Series:
0     Apple
1    Banana
2    Cherry
3    Durian
dtype: object

DataFrame:
    Fruit   Color  Price
0   Apple     Red    1.0
1  Banana  Yellow    0.5
2  Cherry     Red    2.0
3  Durian   Green    3.0


In [None]:
Q5. What are some common functions you can use to manipulate data in a Pandas DataFrame? Can 
you give an example of when you might use one of these functions?

In [None]:
Pandas provides a wide range of functions and methods for manipulating data in a DataFrame. Here are some common functions and methods, along with examples of when you might use them:

1. Selecting Columns:
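   - `DataFrame['ColumnName']` or `DataFrame.ColumnName`: You can use square brackets or dot notation to select a specific column from a DataFrame. Dot notation only works when the column name is a valid Python identifier and does not clash with an existing DataFrame attribute or method.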

   ```python
   # Example: Selecting the 'Age' column
   age_column = df['Age']
   ```

2. Filtering Data:
   - Boolean indexing (`DataFrame[condition]`), `DataFrame.loc[]`, or `DataFrame.iloc[]`: A boolean condition keeps the rows that satisfy it, `loc` selects by label (and also accepts a boolean mask), and `iloc` selects by integer position.

   ```python
   # Example: Selecting rows where 'Age' is greater than 25
   filtered_data = df[df['Age'] > 25]
   ```

3. Sorting Data:
   - `DataFrame.sort_values()`: This method allows you to sort the DataFrame based on one or more columns.

   ```python
   # Example: Sorting by 'Age' in ascending order
   sorted_df = df.sort_values(by='Age', ascending=True)
   ```

4. Adding New Columns:
   - You can create new columns by simply assigning values to them.

   ```python
   # Example: Adding a new column 'IsAdult' based on the 'Age' column
   df['IsAdult'] = df['Age'] >= 18
   ```

5. Grouping and Aggregating Data:
   - `DataFrame.groupby()` and various aggregation functions (e.g., `mean()`, `sum()`, `count()`): You can group data by one or more columns and compute summary statistics for each group.

   ```python
   # Example: Grouping by 'Gender' and calculating the mean age in each group
   gender_groups = df.groupby('Gender')['Age'].mean()
   ```

6. Merging and Joining Data:
   - `DataFrame.merge()` and `pd.concat()`: `merge()` joins two DataFrames on common columns or indexes, much like a SQL join, while `pd.concat()` stacks DataFrames along rows or columns.

   ```python
   # Example: Merging two DataFrames, df1 and df2, that both have an 'ID' column
   merged_df = df1.merge(df2, on='ID', how='inner')
   ```

7. Handling Missing Data:
   - `DataFrame.dropna()`, `DataFrame.fillna()`: You can remove or fill in missing values in your DataFrame.

   ```python
   # Example: Dropping rows with missing values
   df_cleaned = df.dropna()

   # Example: Filling missing values in the 'Age' column with the mean age
   # (assigning the result back avoids chained in-place modification)
   df['Age'] = df['Age'].fillna(df['Age'].mean())
   ```

8. Pivoting and Reshaping Data:
   - `DataFrame.pivot()` and `DataFrame.melt()`: These functions help in reshaping data for analysis.

   ```python
   # Example: Pivoting a DataFrame
   pivoted_df = df.pivot(index='Name', columns='Gender', values='Age')
   ```

These are just some of the common functions and methods you can use to manipulate data in a Pandas DataFrame. The choice of function depends on the specific data manipulation or analysis task you need to perform.
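
As a rough end-to-end sketch, several of these operations can be run on one small table. The `people` frame reuses the Alice/Bob/Claire data from Q3. The `cities` table and its `City` column are hypothetical and exist only to give the merge step something to join.

```python
import pandas as pd

people = pd.DataFrame({
    "Name": ["Alice", "Bob", "Claire"],
    "Age": [25, 30, 27],
    "Gender": ["Female", "Male", "Female"],
})

# Boolean mask inside .loc: rows with Age > 25, keeping two columns
older = people.loc[people["Age"] > 25, ["Name", "Age"]]

# Position-based selection with .iloc: first two rows, first column
first_two_names = people.iloc[:2, 0]

# Sort by Age, oldest first
by_age = people.sort_values(by="Age", ascending=False)

# Mean age per gender
mean_age = people.groupby("Gender")["Age"].mean()

# Hypothetical lookup table for the merge example
cities = pd.DataFrame({"Name": ["Alice", "Bob"], "City": ["Paris", "Oslo"]})

# Inner merge keeps only the names present in both tables
merged = people.merge(cities, on="Name", how="inner")

# concat stacks frames along an axis instead of matching on keys
stacked = pd.concat([people, people], ignore_index=True)

print(older)
print(mean_age)
print(merged)
print(len(stacked))  # 6
```

Here `merge` drops Claire because she has no row in `cities`, while `concat` simply appends rows without looking at any key.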

In [None]:
Q6. Which of the following are mutable in nature: Series, DataFrame, Panel?

In [None]:
Series and DataFrame are both mutable: their values can be changed in place. A DataFrame is also size-mutable, so columns can be inserted or deleted after it is created. The pandas overview documentation describes its data structures as value-mutable but not always size-mutable, and names the Series as the structure whose length is fixed.

Take care with in-place modifications: every variable that refers to the same object sees the change, so a DataFrame passed to a function or assigned to a second name can be altered unintentionally. Call `.copy()` first when the original must stay intact.
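
A minimal sketch of why in-place changes need care: a second name bound to the same DataFrame is not a copy. The variable names below are illustrative.

```python
import pandas as pd

original = pd.DataFrame({"Age": [25, 30, 27]})

alias = original               # same object, no data copied
independent = original.copy()  # separate object with its own data

# In-place update made through the alias
alias.loc[0, "Age"] = 99

print(original.loc[0, "Age"])     # 99: alias and original are one object
print(independent.loc[0, "Age"])  # 25: the copy is unaffected
```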

Panel, a three-dimensional data structure, was deprecated in pandas 0.20.0 and removed in 0.25.0 because of its limited use and complexity. While it existed, it was value-mutable as well. For higher-dimensional data, the usual alternatives are a DataFrame with a MultiIndex or a NumPy array.
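
As a sketch of the MultiIndex approach, data that would once have gone into a Panel can be stored in a DataFrame whose rows are indexed by two levels. The `year`/`city` levels and the `temp`/`rain` values are invented for illustration.

```python
import pandas as pd

# Hypothetical 3-D data: (year, city) on the rows, measurements on the columns
index = pd.MultiIndex.from_product(
    [[2022, 2023], ["Oslo", "Paris"]], names=["year", "city"]
)
readings = pd.DataFrame(
    {"temp": [6.5, 12.0, 7.0, 13.1], "rain": [760, 640, 800, 610]},
    index=index,
)

# Select one "slice" along the first dimension
print(readings.loc[2023])

# Cross-section along the second dimension
print(readings.xs("Oslo", level="city"))
```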

In [None]:
Q7. Create a DataFrame using multiple Series. Explain with an example.

In [6]:
names = pd.Series(['Alice', 'Bob', 'Claire'])
ages = pd.Series([25, 30, 27])
genders = pd.Series(['Female', 'Male', 'Female'])

df = pd.DataFrame({'Name': names,
                'Age': ages,
                'Gender': genders})
print(df)

     Name  Age  Gender
0   Alice   25  Female
1     Bob   30    Male
2  Claire   27  Female


In [None]:
In this example, we first create three Series, names, ages, and genders, which hold the data for the "Name", "Age", and "Gender" columns, respectively.

We then pass a dictionary to the pd.DataFrame() constructor: each key becomes a column name, and the corresponding Series supplies that column's values.
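
One detail worth knowing about this construction: pandas aligns the Series by index label, not by position. A minimal sketch, using hypothetical labels `a`, `b`, and `c` so that the two Series only partly overlap:

```python
import pandas as pd

names = pd.Series(["Alice", "Bob"], index=["a", "b"])
ages = pd.Series([30, 27], index=["b", "c"])

# The resulting index is the union of both indexes; labels missing
# from one Series are filled with NaN in that column
df = pd.DataFrame({"Name": names, "Age": ages})
print(df)
```

With the default 0, 1, 2 index used in the example above, the labels match exactly, so no gaps appear.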