Q1: What is Matplotlib? Why is it used? Name five plots that can be plotted using the Pyplot module of
Matplotlib.

Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python. It is widely used to turn data into high-quality plots, charts, and graphs.

Matplotlib is used for various purposes, including:

1. **Data Visualization:** Matplotlib enables users to create a wide range of plots to visualize data effectively, helping in data exploration, analysis, and communication.

2. **Publication-Quality Graphics:** It provides extensive customization options to create publication-quality graphics for research papers, presentations, and reports.

3. **Interactivity:** When run with an interactive backend (for example, in a desktop window or a notebook widget), Matplotlib lets users explore data dynamically by zooming, panning, and interacting with plot elements.

4. **Integration with NumPy:** Matplotlib seamlessly integrates with NumPy, making it easy to plot NumPy arrays and manipulate data directly.

5. **Support for Various Plot Types:** Matplotlib's Pyplot module offers functions to create various types of plots, including line plots, scatter plots, bar plots, histograms, pie charts, and more.

Five plots that can be plotted using the Pyplot module of Matplotlib include:

1. **Line Plot:** Used to visualize the relationship between two variables by plotting data points connected by straight lines.

2. **Scatter Plot:** Displays individual data points as markers, allowing users to observe the relationship between two variables.

3. **Bar Plot:** Represents categorical data with rectangular bars, where the length of each bar corresponds to the value of the data.

4. **Histogram:** Shows the distribution of numerical data by dividing the data into bins and plotting the frequency or probability density of each bin.

5. **Pie Chart:** Displays the proportion of each category in a dataset as a circular chart, where each category is represented by a slice of the pie.

These are just a few examples, and Matplotlib offers many more types of plots to cater to various visualization needs.
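As a minimal sketch, the five plot types listed above can be drawn side by side with Pyplot functions. The data here is made up purely for illustration, and `draw_five_plots` is a hypothetical helper name, not part of Matplotlib.

```python
import numpy as np
import matplotlib.pyplot as plt


def draw_five_plots():
    """Draw one small example of each of the five plot types; return the figure."""
    rng = np.random.default_rng(0)  # made-up sample data, for illustration only
    x = np.arange(6)
    samples = rng.normal(0, 1, 200)
    labels = ["A", "B", "C", "D"]
    sizes = [40, 30, 20, 10]

    fig = plt.figure(figsize=(12, 7))

    plt.subplot(2, 3, 1)
    plt.plot(x, x ** 2)                 # line plot
    plt.title("Line plot")

    plt.subplot(2, 3, 2)
    plt.scatter(x, x[::-1])             # scatter plot
    plt.title("Scatter plot")

    plt.subplot(2, 3, 3)
    plt.bar(labels, sizes)              # bar plot
    plt.title("Bar plot")

    plt.subplot(2, 3, 4)
    plt.hist(samples, bins=20)          # histogram
    plt.title("Histogram")

    plt.subplot(2, 3, 5)
    plt.pie(sizes, labels=labels, autopct="%1.0f%%")  # pie chart
    plt.title("Pie chart")

    plt.tight_layout()
    return fig


fig = draw_five_plots()
```

Calling `plt.show()` afterwards displays the figure; the sixth cell of the 2x3 grid is simply left empty.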

Q2: What is a scatter plot? Use the following code to generate data for x and y. Using this generated data
plot a scatter plot.
import numpy as np
np.random.seed(3)
x = 3 + np.random.normal(0, 2, 50)
y = 3 + np.random.normal(0, 2, len(x))
Note: Also add title, xlabel, and ylabel to the plot.

A scatter plot is a type of plot used to display the relationship between two variables by representing individual data points as markers on a two-dimensional plane. Each point on the plot corresponds to a pair of values from the two variables being plotted, with one variable typically plotted on the x-axis and the other on the y-axis. Scatter plots are useful for visualizing patterns, trends, and correlations in data.

Here's how you can generate the data for x and y using the given code and plot a scatter plot using Matplotlib:

import numpy as np
import matplotlib.pyplot as plt

# Generate data for x and y
np.random.seed(3)
x = 3 + np.random.normal(0, 2, 50)
y = 3 + np.random.normal(0, 2, len(x))

# Plot the scatter plot
plt.figure(figsize=(8, 6))
plt.scatter(x, y, color='blue', alpha=0.7)  # 'alpha' parameter controls transparency
plt.title('Scatter Plot')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.grid(True)  # Add grid lines for better visualization
plt.show()

This code will generate a scatter plot using the generated data for x and y. The `scatter()` function from Matplotlib's Pyplot module is used to create the scatter plot, and `title()`, `xlabel()`, and `ylabel()` functions are used to add title, x-axis label, and y-axis label to the plot respectively. The `grid()` function is used to add grid lines to the plot for better visualization.
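Since a scatter plot is often used to judge correlation by eye, it can help to compute the correlation coefficient alongside it. The sketch below reuses the same seeded data and puts the Pearson coefficient from `np.corrcoef` in the title. Because `x` and `y` are drawn independently here, a weak correlation would be expected, but the exact value depends on the random draw.

```python
import numpy as np
import matplotlib.pyplot as plt

# Same data as in the question
np.random.seed(3)
x = 3 + np.random.normal(0, 2, 50)
y = 3 + np.random.normal(0, 2, len(x))

# Pearson correlation: off-diagonal entry of the 2x2 correlation matrix
r = np.corrcoef(x, y)[0, 1]

fig = plt.figure(figsize=(8, 6))
plt.scatter(x, y, color='blue', alpha=0.7)
plt.title(f'Scatter Plot (Pearson r = {r:.2f})')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.grid(True)
```

Call `plt.show()` to display the figure.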

Q3: Why is the subplot() function used? Draw four line plots using the subplot() function.
Use the following data:
import numpy as np
For line 1: x = np.array([0, 1, 2, 3, 4, 5]) and y = np.array([0, 100, 200, 300, 400, 500])
For line 2: x = np.array([0, 1, 2, 3, 4, 5]) and y = np.array([50, 20, 40, 20, 60, 70])
For line 3: x = np.array([0, 1, 2, 3, 4, 5]) and y = np.array([10, 20, 30, 40, 50, 60])
For line 4: x = np.array([0, 1, 2, 3, 4, 5]) and y = np.array([200, 350, 250, 550, 450, 150])

The `subplot()` function in Matplotlib is used to create multiple plots (subplots) within a single figure. It arranges plots in a grid of rows and columns: `plt.subplot(nrows, ncols, index)` selects one cell of the grid, where `index` starts at 1 in the top-left corner and increases from left to right, then top to bottom.

Here's how you can use the `subplot()` function to draw four line plots:

import numpy as np
import matplotlib.pyplot as plt

# Data for line 1
x1 = np.array([0, 1, 2, 3, 4, 5])
y1 = np.array([0, 100, 200, 300, 400, 500])

# Data for line 2
x2 = np.array([0, 1, 2, 3, 4, 5])
y2 = np.array([50, 20, 40, 20, 60, 70])

# Data for line 3
x3 = np.array([0, 1, 2, 3, 4, 5])
y3 = np.array([10, 20, 30, 40, 50, 60])

# Data for line 4
x4 = np.array([0, 1, 2, 3, 4, 5])
y4 = np.array([200, 350, 250, 550, 450, 150])

# Create subplots
plt.figure(figsize=(10, 8))

# Subplot 1
plt.subplot(2, 2, 1)
plt.plot(x1, y1, color='blue')
plt.title('Line Plot 1')

# Subplot 2
plt.subplot(2, 2, 2)
plt.plot(x2, y2, color='red')
plt.title('Line Plot 2')

# Subplot 3
plt.subplot(2, 2, 3)
plt.plot(x3, y3, color='green')
plt.title('Line Plot 3')

# Subplot 4
plt.subplot(2, 2, 4)
plt.plot(x4, y4, color='orange')
plt.title('Line Plot 4')

plt.tight_layout()
plt.show()

This code will create four line plots arranged in a 2x2 grid using the `subplot()` function. Each subplot will display one of the provided datasets as a line plot.
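The same figure can also be built with a loop, which makes the `subplot()` index numbering explicit. This is only a sketch: `draw_line_grid` is a hypothetical helper, and the row and column shown in each title are computed from the index with `divmod` to illustrate the left-to-right, top-to-bottom order.

```python
import numpy as np
import matplotlib.pyplot as plt

# All four lines share the same x values
x = np.array([0, 1, 2, 3, 4, 5])
ys = [
    np.array([0, 100, 200, 300, 400, 500]),
    np.array([50, 20, 40, 20, 60, 70]),
    np.array([10, 20, 30, 40, 50, 60]),
    np.array([200, 350, 250, 550, 450, 150]),
]
colors = ["blue", "red", "green", "orange"]


def draw_line_grid(nrows=2, ncols=2):
    """Plot each dataset in its own cell of an nrows x ncols grid."""
    fig = plt.figure(figsize=(10, 8))
    for index, (y, color) in enumerate(zip(ys, colors), start=1):
        # subplot indices are 1-based and run row by row
        row, col = divmod(index - 1, ncols)
        plt.subplot(nrows, ncols, index)
        plt.plot(x, y, color=color)
        plt.title(f"Line Plot {index} (row {row}, col {col})")
    plt.tight_layout()
    return fig


fig = draw_line_grid()
```

Call `plt.show()` to display the grid. The rows and columns in the titles are zero-based, so index 3 lands in row 1, column 0.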

Q4: What is a bar plot? Why is it used? Using the following data plot a bar plot and a horizontal bar plot.
import numpy as np
company = np.array(["Apple", "Microsoft", "Google", "AMD"])
profit = np.array([3000, 8000, 1000, 10000])

A bar plot is a type of plot that represents categorical data with rectangular bars. The length of each bar corresponds to the value of the data it represents. Bar plots are commonly used to compare and visualize the values of different categories or groups.

Bar plots are used for the following purposes:

1. **Comparison:** Bar plots are effective for comparing the values of different categories or groups visually.
  
2. **Trend Analysis:** They can be used to analyze trends or patterns across different categories or groups over time or other dimensions.

3. **Presentation:** Bar plots are often used in presentations, reports, and publications to present categorical data in a clear and easy-to-understand manner.

Now, let's create a bar plot and a horizontal bar plot using the provided data:

import numpy as np
import matplotlib.pyplot as plt

# Data
company = np.array(["Apple", "Microsoft", "Google", "AMD"])
profit = np.array([3000, 8000, 1000, 10000])

# Bar plot
plt.figure(figsize=(8, 6))
plt.bar(company, profit, color='skyblue')
plt.title('Bar Plot of Company Profits')
plt.xlabel('Company')
plt.ylabel('Profit ($)')
plt.xticks(rotation=45)  # Rotate x-axis labels for better readability
plt.grid(axis='y')  # Add horizontal grid lines
plt.tight_layout()
plt.show()

# Horizontal bar plot
plt.figure(figsize=(8, 6))
plt.barh(company, profit, color='lightgreen')
plt.title('Horizontal Bar Plot of Company Profits')
plt.xlabel('Profit ($)')
plt.ylabel('Company')
plt.grid(axis='x')  # Add vertical grid lines
plt.tight_layout()
plt.show()

This code will generate a bar plot and a horizontal bar plot of company profits using the provided data. In the vertical plot the bar heights represent the profit values for each company; in the horizontal plot the bar lengths do. The `bar()` function is used for vertical bar plots, and the `barh()` function is used for horizontal bar plots.
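Because bar plots are mainly used for comparison, sorting the bars and labelling them with their values can make the ranking easier to read. The following is a sketch under two assumptions: sorting by profit is a presentation choice not asked for in the question, and `plt.bar_label()` requires Matplotlib 3.4 or newer.

```python
import numpy as np
import matplotlib.pyplot as plt

company = np.array(["Apple", "Microsoft", "Google", "AMD"])
profit = np.array([3000, 8000, 1000, 10000])

# Sort both arrays by ascending profit
order = np.argsort(profit)
sorted_company = company[order]
sorted_profit = profit[order]

fig = plt.figure(figsize=(8, 6))
bars = plt.barh(sorted_company, sorted_profit, color='lightgreen')
plt.bar_label(bars, padding=3)  # write each value next to its bar (Matplotlib 3.4+)
plt.title('Company Profits, Sorted')
plt.xlabel('Profit ($)')
plt.ylabel('Company')
plt.tight_layout()
```

With `barh()`, the first category is drawn at the bottom, so ascending order puts the most profitable company at the top. Call `plt.show()` to display the figure.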

Q5: What is a box plot? Why is it used? Using the following data plot a box plot.
box1 = np.random.normal(100, 10, 200)
box2 = np.random.normal(90, 20, 200)

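A box plot, also known as a box-and-whisker plot, is a graphical representation of the distribution of a dataset based on five summary statistics: minimum, first quartile (Q1), median (Q2), third quartile (Q3), and maximum. By default, Matplotlib's `boxplot()` draws the whiskers to the most extreme data points within 1.5 × IQR of the box and plots any points beyond them individually as outliers, so the whisker ends equal the true minimum and maximum only when there are no outliers. The plot provides a concise summary of the distribution's central tendency, spread, and skewness.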

A box plot is used for the following purposes:

1. **Identifying Outliers:** Box plots visually highlight potential outliers in the dataset, which are observations that fall significantly outside the range of the majority of the data.

2. **Comparing Distributions:** Box plots allow for easy comparison of the distributions of multiple datasets or groups, providing insights into differences in central tendency, variability, and skewness.

3. **Visualizing Spread and Variability:** The length of the box in a box plot represents the interquartile range (IQR), which provides a measure of spread or variability in the data.

4. **Detecting Skewness:** Box plots can reveal the skewness of the dataset based on the relative positions of the median and quartiles.

Now, let's create a box plot using the provided data:

import numpy as np
import matplotlib.pyplot as plt

# Generate data (no random seed is set, so the values differ on each run)
box1 = np.random.normal(100, 10, 200)
box2 = np.random.normal(90, 20, 200)

# Combine data into a list
data = [box1, box2]

# Create box plot
plt.figure(figsize=(8, 6))
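# patch_artist=True fills the boxes; notch=True notches them around the median.
# Boxes are vertical by default, so no orientation argument is needed.
plt.boxplot(data, patch_artist=True, notch=True)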
plt.title('Box Plot of Two Datasets')
plt.xlabel('Dataset')
plt.ylabel('Values')
plt.xticks([1, 2], ['Box 1', 'Box 2'])
plt.grid(axis='y')  # Add horizontal grid lines
plt.tight_layout()
plt.show()

This code will generate a box plot comparing the distributions of two datasets (`box1` and `box2`). The box plot provides information about the central tendency, spread, and distribution shape of each dataset, allowing for visual comparison between the two distributions.
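The numbers behind each box can also be computed directly, which helps when reading the plot. The sketch below uses `np.percentile` and the 1.5 × IQR rule that `boxplot()` applies by default. `box_stats` is a hypothetical helper written for illustration, not a Matplotlib function, and it ignores edge cases such as empty input.

```python
import numpy as np


def box_stats(values, whis=1.5):
    """Return quartiles, IQR, whisker ends, and outliers for a 1-D dataset."""
    values = np.asarray(values, dtype=float)
    q1, median, q3 = np.percentile(values, [25, 50, 75])
    iqr = q3 - q1

    # Points outside these fences are treated as outliers
    lower_fence = q1 - whis * iqr
    upper_fence = q3 + whis * iqr
    is_inside = (values >= lower_fence) & (values <= upper_fence)

    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": values[is_inside].min(),   # lowest point inside the fences
        "whisker_high": values[is_inside].max(),  # highest point inside the fences
        "outliers": values[~is_inside],
    }


# Small hand-checkable example: nine evenly spaced values plus one extreme point
stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
print(stats)
# q1 = 3.25, median = 5.5, q3 = 7.75, iqr = 4.5
# whiskers end at 1.0 and 9.0; 100.0 is reported as an outlier
```

Applying `box_stats(box1)` and `box_stats(box2)` to the data above should give the same quartiles the boxes are drawn from.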