**QUESTIONS**
Q1. Load the flight price dataset and examine its dimensions. How many rows and columns does the
dataset have?

Q2. What is the distribution of flight prices in the dataset? Create a histogram to visualize the
distribution.

Q3. What is the range of prices in the dataset? What are the minimum and maximum prices?

Q4. How does the price of flights vary by airline? Create a boxplot to compare the prices of different
airlines.

Q5. Are there any outliers in the dataset? Identify any potential outliers using a boxplot and describe how
they may impact your analysis.

Q6. You are working for a travel agency, and your boss has asked you to analyze the Flight Price dataset
to identify the peak travel season. What features would you analyze to identify the peak season, and how
would you present your findings to your boss?

Q7. You are a data analyst for a flight booking website, and you have been asked to analyze the Flight
Price dataset to identify any trends in flight prices. What features would you analyze to identify these
trends, and what visualizations would you use to present your findings to your team?

Q8. You are a data scientist working for an airline company, and you have been asked to analyze the
Flight Price dataset to identify the factors that affect flight prices. What features would you analyze to
identify these factors, and how would you present your findings to the management team?

Q9. Load the Google Playstore dataset and examine its dimensions. How many rows and columns does
the dataset have?

Q10. How does the rating of apps vary by category? Create a boxplot to compare the ratings of different
app categories.

Q11. Are there any missing values in the dataset? Identify any missing values and describe how they may
impact your analysis.

Q12. What is the relationship between the size of an app and its rating? Create a scatter plot to visualize
the relationship.

Q13. How does the type of app affect its price? Create a bar chart to compare average prices by app type.

Q14. What are the top 10 most popular apps in the dataset? Create a frequency table to identify the apps
with the highest number of installs.

Q15. A company wants to launch a new app on the Google Playstore and has asked you to analyze the
Google Playstore dataset to identify the most popular app categories. How would you approach this
task, and what features would you analyze to make recommendations to the company?

Q16. A mobile app development company wants to analyze the Google Playstore dataset to identify the
most successful app developers. What features would you analyze to make recommendations to the
company, and what data visualizations would you use to present your findings?

Q17. A marketing research firm wants to analyze the Google Playstore dataset to identify the best time to
launch a new app. What features would you analyze to make recommendations to the company, and
what data visualizations would you use to present your findings?

**ANSWERS**

### Flight Price Dataset:

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Q1
flight_data = pd.read_csv('flight_price_dataset.csv')
print("Dimensions of the Flight Price dataset:", flight_data.shape)

# Q2
plt.figure(figsize=(10, 6))
sns.histplot(flight_data['Price'], bins=30, kde=True)
plt.title('Distribution of Flight Prices')
plt.show()

# Q3
min_price, max_price = flight_data['Price'].min(), flight_data['Price'].max()
print("Minimum price:", min_price)
print("Maximum price:", max_price)
print("Range (max - min):", max_price - min_price)

# Q4
plt.figure(figsize=(12, 8))
sns.boxplot(x='Airline', y='Price', data=flight_data)
plt.title('Flight Prices by Airline')
plt.xticks(rotation=45)
plt.show()

# Q5
plt.figure(figsize=(8, 6))
sns.boxplot(x=flight_data['Price'])
plt.title('Boxplot of Flight Prices')
plt.show()

```
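The boxplot for Q5 shows outliers visually but does not list them. A minimal sketch of the usual 1.5 × IQR rule (the same rule a default boxplot uses for its whiskers) is given below. The helper name `iqr_outliers` and the multiplier argument `k` are illustrative, not part of the original answer.

```python
import pandas as pd


def iqr_outliers(prices: pd.Series, k: float = 1.5) -> pd.Series:
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1 = prices.quantile(0.25)
    q3 = prices.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    # Keep only the observations beyond either fence
    return prices[(prices < lower) | (prices > upper)]
```

Calling `iqr_outliers(flight_data['Price'])` would return the flagged prices. Such values pull the mean upward and stretch the histogram axis, so it is usually safer to report the median alongside the mean and to check whether conclusions change when the flagged rows are excluded.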
```python
# Q6
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

flight_data['Date_of_Journey'] = pd.to_datetime(flight_data['Date_of_Journey'])
flight_data['Month'] = flight_data['Date_of_Journey'].dt.month

# Visualize the average prices per month
plt.figure(figsize=(12, 8))
sns.lineplot(x='Month', y='Price', data=flight_data, estimator='mean', marker='o')
plt.title('Average Flight Prices by Month')
plt.xlabel('Month')
plt.ylabel('Average Price')
plt.show()
```
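For Q6, a line chart of average price is easier to defend to a non-technical audience when it is backed by a small table. The sketch below summarizes each month by number of flights and by mean and median price, assuming the `Date_of_Journey` and `Price` columns used above. The function name `monthly_summary` is hypothetical.

```python
import pandas as pd


def monthly_summary(df, date_col="Date_of_Journey", price_col="Price"):
    """Summarize flight count and price level per calendar month."""
    # Parse the journey date; pass an explicit format if your file needs one
    dates = pd.to_datetime(df[date_col])
    summary = (
        df.assign(Month=dates.dt.month)
        .groupby("Month")[price_col]
        .agg(flights="count", mean_price="mean", median_price="median")
    )
    # Most expensive months first
    return summary.sort_values("mean_price", ascending=False)
```

Months that combine a high flight count with a high median price are the strongest candidates for the peak season. A month with a high mean but an unremarkable median is more likely driven by a few expensive tickets.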

```python
# Q7
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Example: Analyzing trends by Airline
plt.figure(figsize=(12, 8))
sns.lineplot(x='Date_of_Journey', y='Price', hue='Airline', data=flight_data, estimator='mean', marker='o')
plt.title('Flight Price Trends by Airline')
plt.xlabel('Date of Journey')
plt.ylabel('Average Price')
plt.legend(loc='upper right')
plt.show()

```


```python
# Q8
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Example: comparing prices across airlines, split by number of stops
plt.figure(figsize=(12, 8))
sns.boxplot(x='Airline', y='Price', hue='Total_Stops', data=flight_data)
plt.title('Flight Prices by Airline and Number of Stops')
plt.xlabel('Airline')
plt.ylabel('Price')
plt.xticks(rotation=45)
plt.legend(title='Total Stops', loc='upper right')
plt.show()
```
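The boxplot for Q8 looks at only two features at once. As a rough first screen, one could compare how far the median price moves between the groups of each candidate feature. The sketch below does this for any list of categorical columns. The function name `median_price_spread` is hypothetical, and a large spread only suggests an association with price rather than establishing a cause.

```python
import pandas as pd


def median_price_spread(df, features, price_col="Price"):
    """Rank categorical features by the gap between their group medians."""
    rows = []
    for col in features:
        medians = df.groupby(col)[price_col].median()
        rows.append({
            "feature": col,
            "groups": len(medians),
            "lowest_median": medians.min(),
            "highest_median": medians.max(),
            "spread": medians.max() - medians.min(),
        })
    table = pd.DataFrame(rows).set_index("feature")
    # Features whose groups differ most in typical price come first
    return table.sort_values("spread", ascending=False)
```

With the columns used in this section, a call such as `median_price_spread(flight_data, ['Airline', 'Total_Stops'])` gives a compact table that can accompany the boxplots in a management summary.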

### Google Playstore Dataset:

```python

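# Q9
# Load the dataset (the filename is an assumption; adjust it to match your copy)
google_playstore_data = pd.read_csv('googleplaystore.csv')
print("Dimensions of the Google Playstore dataset:", google_playstore_data.shape)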

# Q10
plt.figure(figsize=(12, 8))
sns.boxplot(x='Category', y='Rating', data=google_playstore_data)
plt.title('App Ratings by Category')
plt.xticks(rotation=45)
plt.show()

# Q11
missing_values = google_playstore_data.isnull().sum()
print("Missing values in the dataset:\n", missing_values)

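# Q12
# Size may be stored as text such as '19M' or '201k' (as in the common Kaggle version);
# convert it to megabytes, leaving non-numeric entries as NaN
size_text = google_playstore_data['Size'].astype(str)
size_mb = pd.to_numeric(size_text.str.replace(r'[MmKk]$', '', regex=True), errors='coerce')
is_kb = size_text.str.lower().str.endswith('k')
size_mb = size_mb.where(~is_kb, size_mb / 1024)

plt.figure(figsize=(10, 6))
sns.scatterplot(x=size_mb, y=google_playstore_data['Rating'])
plt.title('Relationship between App Size and Rating')
plt.xlabel('Size (MB)')
plt.show()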

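# Q13
# Price may be stored as text such as '$4.99' (as in the common Kaggle version);
# convert it to a number before averaging
price_numeric = pd.to_numeric(
    google_playstore_data['Price'].astype(str).str.replace('$', '', regex=False),
    errors='coerce'
)
plt.figure(figsize=(10, 6))
sns.barplot(x=google_playstore_data['Type'], y=price_numeric, estimator='mean')
plt.title('Average Prices by App Type')
plt.ylabel('Average Price')
plt.show()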

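# Q14
# Installs may be stored as text such as '1,000,000+' (as in the common Kaggle version);
# convert it to a number so that sorting is numeric rather than alphabetical
installs_numeric = pd.to_numeric(
    google_playstore_data['Installs'].astype(str).str.replace(r'[+,]', '', regex=True),
    errors='coerce'
)
top_10_apps = (
    google_playstore_data[['App']]
    .assign(Installs=installs_numeric)
    .drop_duplicates(subset='App')
    .sort_values(by='Installs', ascending=False)
    .head(10)
)
print("Top 10 most popular apps:\n", top_10_apps)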


```
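The raw counts printed for Q11 are easier to interpret as a share of all rows. The sketch below builds a small report that lists only the columns with gaps. The function name `missing_report` is illustrative.

```python
import pandas as pd


def missing_report(df):
    """List columns with missing values, as counts and as a percentage of rows."""
    counts = df.isnull().sum()
    report = pd.DataFrame({
        "missing": counts,
        "percent": (counts / len(df) * 100).round(2),
    })
    # Drop complete columns and show the worst-affected ones first
    return report[report["missing"] > 0].sort_values("missing", ascending=False)
```

Running `missing_report(google_playstore_data)` shows where the gaps are concentrated. If a column such as `Rating` has a noticeable share of missing entries, plots and averages silently drop those rows, so any per-category comparison describes only the apps that have a value and may be biased if the gaps are not random.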




```python
# Q15
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

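# Installs may be stored as text such as '1,000,000+' (as in the common Kaggle version);
# convert it to a number, then total the installs per category in billions
installs_numeric = pd.to_numeric(
    google_playstore_data['Installs'].astype(str).str.replace(r'[+,]', '', regex=True),
    errors='coerce'
)
category_installs = (
    installs_numeric.groupby(google_playstore_data['Category']).sum()
    .sort_values(ascending=False) / 1e9
)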

# Plot the most popular app categories
plt.figure(figsize=(12, 8))
sns.barplot(x=category_installs.values, y=category_installs.index, palette='viridis')
plt.title('Most Popular App Categories by Installs')
plt.xlabel('Total Installs (Billions)')
plt.ylabel('App Category')
plt.show()
```
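For Q17, one possible starting point is the timing of app updates. The sketch below counts apps per calendar month, assuming the dataset has a `Last Updated` column with dates written like `January 7, 2018`, as the commonly used Kaggle version does. Both the column name and the function name `updates_by_month` are assumptions. Note that the last-update date is only a proxy, since it is not the same as the original launch date.

```python
import pandas as pd


def updates_by_month(df, date_col="Last Updated"):
    """Count apps by the calendar month of their most recent update."""
    # Unparseable entries become NaT and are ignored by value_counts
    dates = pd.to_datetime(df[date_col], errors="coerce")
    return dates.dt.month_name().value_counts()
```

The resulting counts can be shown as a bar chart by month. Combining them with average rating or installs per month would indicate whether busy months are also months in which apps perform well.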

