# Data Manipulation (Pandas) - Problem Set

### Exercise 1: Create a DataFrame of Stock Prices and Volume

Create a DataFrame that contains stock data for three different stocks: `AAPL`, `GOOGL`, and `AMZN` over five days. The DataFrame should have the following columns: `Date`, `Stock`, `Price`, and `Volume`. Then, display the DataFrame.

In [None]:
# Your code here
import pandas as pd

# Creating a DataFrame with stock prices and volumes
data = {
    'Date': ['2023-09-01', '2023-09-02', '2023-09-03', '2023-09-04', '2023-09-05'],
    'Stock': ['AAPL', 'AAPL', 'GOOGL', 'GOOGL', 'AMZN'],
    'Price': [150.25, 153.50, 2800.50, 2830.75, 3400.00],
    'Volume': [1000, 1100, 1200, 1300, 1400]
}

df = pd.DataFrame(data)
print(df)
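
The `Date` column above holds plain strings. As a minimal sketch, assuming ISO-formatted dates like the ones in this exercise, `pd.to_datetime` converts them so that calendar accessors become available. The `sample` frame and the `Weekday` column are hypothetical names used only for illustration.

```python
import pandas as pd

# Hypothetical two-row frame that mirrors the Date layout used above
sample = pd.DataFrame({
    'Date': ['2023-09-01', '2023-09-02'],
    'Price': [150.25, 153.50],
})

# Convert ISO-formatted strings to datetime64 values
sample['Date'] = pd.to_datetime(sample['Date'])

# Datetime columns expose calendar helpers through the .dt accessor
sample['Weekday'] = sample['Date'].dt.day_name()
print(sample.dtypes)
print(sample)
```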

### Exercise 2: Calculate Percentage Change in Stock Prices

Using the DataFrame you created in Exercise 1, calculate the daily percentage change in stock prices and add a new column `Pct_Change` to store the results. Display the updated DataFrame.

In [None]:
# Your code here
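# Compute the change within each stock, so the first row of a ticker is NaN
# instead of a comparison against a different stock's price
df['Pct_Change'] = df.groupby('Stock')['Price'].pct_change()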
print(df)

### Exercise 3: Filter Stocks Based on Price

Filter the DataFrame to display only the rows where the stock price is above $2000. Display the filtered DataFrame.

In [None]:
# Your code here
filtered_df = df[df['Price'] > 2000]
print(filtered_df)
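
Boolean filters can also be combined. The sketch below assumes a small hypothetical frame (`prices`) and a volume threshold of 1300 chosen only for illustration; it shows that each condition needs its own parentheses when joined with `&` or `|`.

```python
import pandas as pd

# Hypothetical data with the same column names as the exercise
prices = pd.DataFrame({
    'Stock': ['AAPL', 'GOOGL', 'AMZN'],
    'Price': [150.25, 2800.50, 3400.00],
    'Volume': [1000, 1200, 1400],
})

# Wrap each condition in parentheses; & means "and", | means "or"
high_price_high_volume = prices[(prices['Price'] > 2000) & (prices['Volume'] > 1300)]
print(high_price_high_volume)
```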

### Exercise 4: Add a New Column for Total Value

Add a new column to the DataFrame called `Total_Value`, which represents the total value of traded stock for each day (calculated as `Price * Volume`). Then display the updated DataFrame.

In [None]:
# Your code here
df['Total_Value'] = df['Price'] * df['Volume']
print(df)

### Exercise 5: Group Data by Stock and Calculate the Average Price

Group the DataFrame by the `Stock` column and calculate the average stock price for each stock. Display the resulting DataFrame with the stock symbol and its average price.

In [None]:
# Your code here
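# reset_index turns the grouped Series into a DataFrame with a Stock column
average_price = df.groupby('Stock')['Price'].mean().reset_index(name='Average_Price')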
print(average_price)
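
When more than one statistic per stock is useful, `agg` accepts a list of function names. This is a minimal sketch on a hypothetical frame (`sample`) that reuses the prices from Exercise 1; the choice of `mean`, `min`, and `max` is only an illustration.

```python
import pandas as pd

# Hypothetical frame reusing the Exercise 1 prices
sample = pd.DataFrame({
    'Stock': ['AAPL', 'AAPL', 'GOOGL', 'GOOGL', 'AMZN'],
    'Price': [150.25, 153.50, 2800.50, 2830.75, 3400.00],
})

# One row per stock, one column per statistic; groups are sorted by key
summary = sample.groupby('Stock')['Price'].agg(['mean', 'min', 'max']).reset_index()
print(summary)
```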

### Exercise 6: Generate a New DataFrame for Portfolio Analysis

Create a new DataFrame that represents a portfolio of investments. The DataFrame should contain the following columns: `Stock`, `Shares`, `Price_Per_Share`, and `Investment_Value` (calculated as `Shares * Price_Per_Share`). Populate the DataFrame with data for three stocks of your choice and display the DataFrame.

In [None]:
# Your code here
portfolio_data = {
    'Stock': ['AAPL', 'GOOGL', 'AMZN'],
    'Shares': [10, 5, 3],
    'Price_Per_Share': [150.25, 2800.50, 3400.00]
}

portfolio_df = pd.DataFrame(portfolio_data)
portfolio_df['Investment_Value'] = portfolio_df['Shares'] * portfolio_df['Price_Per_Share']
print(portfolio_df)

### Exercise 7: Sort Stocks by Price

Sort the original DataFrame (from Exercise 1) by the `Price` column in descending order and display the sorted DataFrame.

In [None]:
# Your code here
sorted_df = df.sort_values(by='Price', ascending=False)
print(sorted_df)

### Exercise 8: Calculate Moving Average of Stock Prices

For each stock, calculate the 2-day moving average of the stock prices. Create a new column `Moving_Avg` in the DataFrame and display the updated DataFrame.

In [None]:
# Your code here
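# Roll within each stock so the window never spans two different tickers
df['Moving_Avg'] = df.groupby('Stock')['Price'].transform(lambda s: s.rolling(window=2).mean())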
print(df)

### Exercise 9: Replace Missing Values

Assume that some stock prices in your DataFrame are missing (NaN values). Replace the missing values with the average price of the respective stock.

In [None]:
# Your code here
# Simulating missing values
df.loc[1, 'Price'] = None

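# Replacing missing values with the mean price of the respective stock
# (transform broadcasts each stock's mean back to its own rows)
stock_mean = df.groupby('Stock')['Price'].transform('mean')
df['Price'] = df['Price'].fillna(stock_mean)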
print(df)

### Exercise 10: Rank Stocks by Volume

Rank the stocks based on their trading volume. Add a new column `Volume_Rank` that shows the rank of each stock based on its volume, with 1 being the highest. Display the updated DataFrame.

In [None]:
# Your code here
df['Volume_Rank'] = df['Volume'].rank(ascending=False)
print(df)
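
The volumes in this exercise are all distinct, so ties never occur. As a sketch of what happens when they do, the hypothetical series below contains a repeated value and compares three of the `method` options of `rank`.

```python
import pandas as pd

# Hypothetical volumes with one tie (1200 appears twice)
volumes = pd.Series([1400, 1200, 1200, 1000])

# Default method='average': tied values share the mean of their positions
average_rank = volumes.rank(ascending=False)

# method='min': tied values share the lowest position, leaving a gap after them
min_rank = volumes.rank(ascending=False, method='min')

# method='dense': tied values share a position and no gap follows
dense_rank = volumes.rank(ascending=False, method='dense')

print(pd.DataFrame({'average': average_rank, 'min': min_rank, 'dense': dense_rank}))
```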

### Exercise 11: Concatenate DataFrames Vertically

Create two DataFrames representing stock prices for different sets of stocks. Concatenate them vertically (i.e., one DataFrame below the other) to form a single DataFrame. Display the concatenated DataFrame.

In [None]:
# Your code here
import pandas as pd

# DataFrame 1: Stock prices for AAPL and GOOGL
data1 = {
    'Stock': ['AAPL', 'GOOGL'],
    'Price': [150.25, 2800.50],
    'Date': ['2023-09-01', '2023-09-01']
}
df1 = pd.DataFrame(data1)

# DataFrame 2: Stock prices for AMZN and MSFT
data2 = {
    'Stock': ['AMZN', 'MSFT'],
    'Price': [3400.00, 305.50],
    'Date': ['2023-09-01', '2023-09-01']
}
df2 = pd.DataFrame(data2)

In [None]:
# Concatenating the DataFrames vertically
concat_df = pd.concat([df1, df2], ignore_index=True)
print(concat_df)
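
To see what `ignore_index=True` contributes, the sketch below concatenates two hypothetical frames (`top` and `bottom`) with and without it. Without the flag, each frame keeps its own row labels, so labels repeat in the result.

```python
import pandas as pd

# Hypothetical frames, each with the default 0..n-1 index
top = pd.DataFrame({'Stock': ['AAPL', 'GOOGL'], 'Price': [150.25, 2800.50]})
bottom = pd.DataFrame({'Stock': ['AMZN', 'MSFT'], 'Price': [3400.00, 305.50]})

# Original labels are preserved, so 0 and 1 each appear twice
kept = pd.concat([top, bottom])

# Labels are discarded and rebuilt as 0..3
renumbered = pd.concat([top, bottom], ignore_index=True)

print(kept.index.tolist())
print(renumbered.index.tolist())
```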

### Exercise 12: Concatenate DataFrames Horizontally

Create two DataFrames representing different attributes of stocks. Concatenate them horizontally (i.e., side-by-side) so that each stock has multiple attributes. Display the concatenated DataFrame.

In [None]:
# Your code here

# DataFrame 1: Stock names and prices
data1 = {
    'Stock': ['AAPL', 'GOOGL', 'AMZN'],
    'Price': [150.25, 2800.50, 3400.00]
}
df1 = pd.DataFrame(data1)

# DataFrame 2: Stock volumes
data2 = {
    'Volume': [1000, 1200, 1400]
}
df2 = pd.DataFrame(data2)

# Concatenating the DataFrames horizontally
concat_df = pd.concat([df1, df2], axis=1)
print(concat_df)
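
Horizontal concatenation matches rows by index label, not by position. The sketch below uses two hypothetical frames (`left` and `right`) whose indexes are deliberately in different orders to show the effect, and one way to force positional pairing.

```python
import pandas as pd

left = pd.DataFrame({'Stock': ['AAPL', 'GOOGL', 'AMZN']})

# Hypothetical frame whose index labels run in reverse order
right = pd.DataFrame({'Volume': [1000, 1200, 1400]}, index=[2, 1, 0])

# Rows are paired by label: label 0 holds AAPL on the left and 1400 on the right
aligned = pd.concat([left, right], axis=1)

# Resetting the index first pairs rows by position instead
positional = pd.concat([left, right.reset_index(drop=True)], axis=1)

print(aligned)
print(positional)
```

The solution above works as expected because both frames carry the same default index.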

### Exercise 13: Merge Two DataFrames on a Common Column

Create two DataFrames: one containing stock prices and the other containing trading volumes for the same stocks. Merge the two DataFrames on the `Stock` column to combine the price and volume information. Display the merged DataFrame.

In [None]:
# Your code here

# DataFrame 1: Stock prices
prices = {
    'Stock': ['AAPL', 'GOOGL', 'AMZN'],
    'Price': [150.25, 2800.50, 3400.00]
}
df_prices = pd.DataFrame(prices)

# DataFrame 2: Stock volumes
volumes = {
    'Stock': ['AAPL', 'GOOGL', 'AMZN'],
    'Volume': [1000, 1200, 1400]
}
df_volumes = pd.DataFrame(volumes)

# Merging the DataFrames on 'Stock' column
merged_df = pd.merge(df_prices, df_volumes, on='Stock')
print(merged_df)

### Exercise 14: Merge DataFrames with Different Column Names (Using `left_on` and `right_on`)

Create two DataFrames: one containing stock prices with the column name `Ticker`, and the other containing volumes with the column name `Stock`. Merge the two DataFrames using `left_on` and `right_on` to match the appropriate columns. Display the merged DataFrame.

In [None]:
# Your code here

# DataFrame 1: Stock prices with 'Ticker' column
prices = {
    'Ticker': ['AAPL', 'GOOGL', 'AMZN'],
    'Price': [150.25, 2800.50, 3400.00]
}
df_prices = pd.DataFrame(prices)

# DataFrame 2: Stock volumes with 'Stock' column
volumes = {
    'Stock': ['AAPL', 'GOOGL', 'AMZN'],
    'Volume': [1000, 1200, 1400]
}
df_volumes = pd.DataFrame(volumes)

# Merging the DataFrames on different column names
merged_df = pd.merge(df_prices, df_volumes, left_on='Ticker', right_on='Stock')
print(merged_df)
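
A merge with `left_on` and `right_on` keeps both key columns, so the result carries the same tickers twice. As a minimal sketch with hypothetical frame names (`ticker_prices`, `stock_volumes`), the redundant column can be dropped afterwards.

```python
import pandas as pd

ticker_prices = pd.DataFrame({
    'Ticker': ['AAPL', 'GOOGL', 'AMZN'],
    'Price': [150.25, 2800.50, 3400.00],
})
stock_volumes = pd.DataFrame({
    'Stock': ['AAPL', 'GOOGL', 'AMZN'],
    'Volume': [1000, 1200, 1400],
})

# Both 'Ticker' and 'Stock' survive the merge
merged = pd.merge(ticker_prices, stock_volumes, left_on='Ticker', right_on='Stock')

# Drop the duplicate key column to keep a single identifier
deduplicated = merged.drop(columns='Stock')

print(merged.columns.tolist())
print(deduplicated)
```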

### Exercise 15: Perform an Inner Join

Create two DataFrames, one containing stock prices for AAPL, GOOGL, and AMZN, and another containing volumes for AAPL and AMZN. Perform an inner join on the `Stock` column to include only the stocks that are common to both DataFrames. Display the result.

In [None]:
# Your code here

# DataFrame 1: Stock prices
prices = {
    'Stock': ['AAPL', 'GOOGL', 'AMZN'],
    'Price': [150.25, 2800.50, 3400.00]
}
df_prices = pd.DataFrame(prices)

# DataFrame 2: Stock volumes (only AAPL and AMZN)
volumes = {
    'Stock': ['AAPL', 'AMZN'],
    'Volume': [1000, 1400]
}
df_volumes = pd.DataFrame(volumes)

# Performing an inner join on 'Stock' column
inner_join_df = pd.merge(df_prices, df_volumes, on='Stock', how='inner')
print(inner_join_df)

### Exercise 16: Perform a Left Join

Using the same DataFrames from Exercise 15, perform a left join to include all stocks from the price DataFrame, and add the volume where it is available. If the volume is missing, the result should show `NaN`. Display the result.

In [None]:
# Your code here

# Performing a left join on 'Stock' column
left_join_df = pd.merge(df_prices, df_volumes, on='Stock', how='left')
print(left_join_df)

### Exercise 17: Perform a Right Join

Using the same DataFrames from Exercise 15, perform a right join to include all stocks from the volume DataFrame, and add the price where it is available. If the price is missing, the result should show `NaN`. Display the result.

In [None]:
# Your code here

# Performing a right join on 'Stock' column
right_join_df = pd.merge(df_prices, df_volumes, on='Stock', how='right')
print(right_join_df)
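
The inner, left, and right joins above each keep a different subset of rows. As a sketch for comparison, an outer join keeps every key from both sides, and `indicator=True` adds a `_merge` column recording where each row came from. The `MSFT` row and its volume of 900 are hypothetical, added only so that all three `_merge` values appear.

```python
import pandas as pd

price_table = pd.DataFrame({
    'Stock': ['AAPL', 'GOOGL', 'AMZN'],
    'Price': [150.25, 2800.50, 3400.00],
})

# Hypothetical volumes: MSFT has no matching price row
volume_table = pd.DataFrame({
    'Stock': ['AAPL', 'AMZN', 'MSFT'],
    'Volume': [1000, 1400, 900],
})

# _merge is 'both', 'left_only', or 'right_only' for each row
outer = pd.merge(price_table, volume_table, on='Stock', how='outer', indicator=True)
print(outer)
```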

### Exercise 18: Combine Multiple DataFrames Using `concat`

Create three DataFrames that each contain stock prices for different stocks over different dates. Concatenate them vertically to form a single DataFrame with all the stock prices over time. Display the result.

In [None]:
# Your code here

# DataFrame 1: Stock prices for AAPL
data1 = {
    'Date': ['2023-09-01', '2023-09-02'],
    'Stock': ['AAPL', 'AAPL'],
    'Price': [150.25, 153.50]
}
df1 = pd.DataFrame(data1)

# DataFrame 2: Stock prices for GOOGL
data2 = {
    'Date': ['2023-09-01', '2023-09-02'],
    'Stock': ['GOOGL', 'GOOGL'],
    'Price': [2800.50, 2820.75]
}
df2 = pd.DataFrame(data2)

# DataFrame 3: Stock prices for AMZN
data3 = {
    'Date': ['2023-09-01', '2023-09-02'],
    'Stock': ['AMZN', 'AMZN'],
    'Price': [3400.00, 3420.25]
}
df3 = pd.DataFrame(data3)

# Concatenating the DataFrames
concat_df = pd.concat([df1, df2, df3], ignore_index=True)
print(concat_df)
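
The concatenated result is in long format, with one row per date and stock. As a sketch of one possible follow-up, `pivot` reshapes such a frame into one row per date and one column per stock. The `long_prices` frame is a hypothetical, shortened stand-in for the concatenated data.

```python
import pandas as pd

# Hypothetical long-format frame shaped like the concatenated result
long_prices = pd.DataFrame({
    'Date': ['2023-09-01', '2023-09-02', '2023-09-01', '2023-09-02'],
    'Stock': ['AAPL', 'AAPL', 'GOOGL', 'GOOGL'],
    'Price': [150.25, 153.50, 2800.50, 2820.75],
})

# Each (Date, Stock) pair must be unique for pivot to succeed
wide = long_prices.pivot(index='Date', columns='Stock', values='Price')
print(wide)
```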