<a href="https://colab.research.google.com/github/havaledar/test/blob/main/ECON3740_Lab_6.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Matplotlib

## Basics of matplotlib and its plotting functions
Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python. It's highly customizable and works well with numpy and pandas data structures, making it a popular choice for data visualization in scientific computing.

## Libraries
First, let's import matplotlib's `pyplot` module, along with numpy and pandas for basic data manipulation and `gaussian_kde` from SciPy for density estimates:

In [None]:
import numpy as np
import pandas as pd

import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde

## Types of plots

### Line Plot
A line plot connects consecutive data points with straight line segments.

In [None]:
x = np.linspace(0, 10, 100)
y = np.sin(x)

In [None]:
plt.plot(x, y)

plt.show()

### Scatter Plot
A scatter plot draws one point per observation on horizontal and vertical axes, which helps show how two variables are related.

In [None]:
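# The next cell reuses x and y from the line plot above.
# Uncomment these two lines to plot random points instead:
# x = np.random.rand(100)
# y = np.random.rand(100)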

In [None]:
plt.scatter(x, y)

plt.show()

### Bar Chart
A bar chart represents data with rectangular bars with lengths proportional to the values they represent.

In [None]:
categories = ['A', 'B', 'C', 'D']
values = [25, 35, 15, 20]

In [None]:
plt.bar(categories, values)

plt.show()

### Pie Chart
A pie chart is a circular statistical graphic divided into slices to illustrate numerical proportion.

In [None]:
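# The next cell reuses values and categories from the bar chart above.
# Uncomment these two lines and pass them to plt.pie to use separate data:
# sizes = [25, 35, 20, 20]
# labels = ['Category A', 'Category B', 'Category C', 'Category D']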

In [None]:
plt.pie(values, labels=categories)

plt.show()
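
Because a pie chart shows proportions, it can help to print each slice's share on the chart. The sketch below repeats the bar chart values and uses the `autopct` argument of `plt.pie` for this; the `shares` list and the chart title are illustrative additions, not part of the lab.

```python
import matplotlib.pyplot as plt

categories = ['A', 'B', 'C', 'D']
values = [25, 35, 15, 20]

# The values sum to 95, so each slice is its share of 95, not of 100
total = sum(values)
shares = [100 * v / total for v in values]

# autopct formats each slice's percentage; '%1.1f%%' keeps one decimal place
plt.pie(values, labels=categories, autopct='%1.1f%%')
plt.title('Share of total by category')
plt.show()
```

Category A, for example, is labeled with about 26.3% (25 out of 95) rather than 25%.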

### Histogram
A histogram represents the distribution of data by forming bins along the range of the data and then drawing bars to show the number of observations that fall in each bin.

See the `plt.hist` documentation for the full list of parameters: https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.hist.html

In [None]:
data = np.random.randn(1000)

In [None]:
plt.hist(data)

plt.show()

In [None]:
plt.hist(data, density=True)

plt.show()
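
To see what `density=True` changes, the following sketch computes the bins directly with `np.histogram`, which `plt.hist` relies on for binning. The seeded generator and the name `sample` are assumptions made here for reproducibility.

```python
import numpy as np

rng = np.random.default_rng(0)        # seeded generator (assumption, for reproducibility)
sample = rng.standard_normal(1000)

# Raw counts: the bar heights add up to the number of observations
counts, edges = np.histogram(sample, bins=10)

# Densities: the bar areas (height * bin width) add up to 1
density, _ = np.histogram(sample, bins=10, density=True)
widths = np.diff(edges)

print(counts.sum())
print((density * widths).sum())
```

With densities on the vertical axis, the histogram is on the same scale as a probability density curve, so the two can be drawn on one plot.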

In [None]:
plt.hist(data, density=True)

# Overlay a kernel density estimate on the normalized histogram
kde = gaussian_kde(data)
x = np.linspace(data.min(), data.max(), 1000)
plt.plot(x, kde(x), color='orange')

plt.show()

### Box Plot
A box plot, or box-and-whisker plot, shows the distribution of quantitative data in a way that facilitates comparisons between variables or across levels of a categorical variable.

In [None]:
data = np.random.rand(10, 4)

In [None]:
plt.boxplot(data)

plt.show()
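
The elements of a box can be reproduced by hand. This is a minimal sketch, assuming matplotlib's default whisker rule of 1.5 times the interquartile range; the variable names and the seeded sample are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)   # seeded generator (assumption, for reproducibility)
col = rng.random(10)             # one column, comparable to data[:, 0] above

# The box spans the first to third quartile, with a line at the median
q1, median, q3 = np.percentile(col, [25, 50, 75])
iqr = q3 - q1

# Whiskers reach the most extreme observations within 1.5 * IQR of the box
lower_whisker = col[col >= q1 - 1.5 * iqr].min()
upper_whisker = col[col <= q3 + 1.5 * iqr].max()

# Anything beyond the whiskers is drawn as an individual outlier point
outliers = col[(col < lower_whisker) | (col > upper_whisker)]

print(q1, median, q3)
print(lower_whisker, upper_whisker, outliers)
```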

### Heatmap
A heatmap represents the values of a matrix as colors. A common use is visualizing a correlation matrix.

In [None]:
data = np.random.rand(10, 10)

In [None]:
plt.imshow(data, cmap='hot', interpolation='nearest')

plt.show()
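
As an illustration of the correlation use case, the sketch below builds a small correlation matrix with `np.corrcoef` and draws it with `imshow`. The three synthetic series `a`, `b`, and `c`, the `coolwarm` colormap, and the fixed color range from -1 to 1 are choices made for this example.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)           # seeded generator (assumption)
a = rng.standard_normal(200)
b = a + rng.standard_normal(200)         # constructed to be positively related to a
c = rng.standard_normal(200)             # independent noise

# Each row is one variable; the result is a 3 x 3 correlation matrix
corr = np.corrcoef([a, b, c])

fig, ax = plt.subplots()
im = ax.imshow(corr, cmap='coolwarm', vmin=-1, vmax=1)
fig.colorbar(im, ax=ax, label='Correlation')

# Label the rows and columns with the variable names
names = ['a', 'b', 'c']
ax.set_xticks(range(3))
ax.set_xticklabels(names)
ax.set_yticks(range(3))
ax.set_yticklabels(names)

plt.show()
```

Fixing `vmin` and `vmax` keeps zero correlation at the middle of the colormap, which makes positive and negative values easy to tell apart.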

## Residuals

In [None]:
%%capture
!pip install wooldridge

import wooldridge as woo
import statsmodels.formula.api as smf

In [None]:
# Load the CEO salary dataset
data = woo.data('ceosal2')

# Regress salary on CEO tenure by ordinary least squares
model = smf.ols('salary ~ ceoten', data=data)

results = model.fit()

print(results.summary())

In [None]:
results.resid
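
Residuals are often easier to read in a plot than in a table. The sketch below plots residuals against fitted values; to stay self-contained it swaps in synthetic data and a `np.polyfit` line fit in place of the `ceosal2` data and the statsmodels model, so `tenure`, `pay`, and the coefficients used to generate them are assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)                       # seeded generator (assumption)
tenure = rng.uniform(0, 30, 150)                     # synthetic stand-in for ceoten
pay = 700 + 10 * tenure + rng.normal(0, 200, 150)    # synthetic stand-in for salary

# Least-squares line: polyfit returns the slope first, then the intercept
slope, intercept = np.polyfit(tenure, pay, 1)
fitted = intercept + slope * tenure
resid = pay - fitted                                 # observed minus fitted

plt.scatter(fitted, resid, alpha=0.5)
plt.axhline(0, color='black', linewidth=1)           # reference line at zero
plt.xlabel('Fitted values')
plt.ylabel('Residuals')
plt.title('Residuals vs. fitted values')
plt.show()
```

With the statsmodels results above, the same plot could be drawn from `results.fittedvalues` and `results.resid`.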

## CEOSAL plot
Now, we'll create a simple scatter plot of CEO salary against years as CEO.

In [None]:
%%capture

!pip install wooldridge

import wooldridge as woo

data = woo.data('ceosal2')

In [None]:
x = data['ceoten']
y = data['salary']

plt.scatter(x, y)
plt.show()

## Customizing plots
Matplotlib plots are highly customizable. You can change the labels, legends, colors, and markers to make your plots more informative and visually appealing.

### Figure size

In [None]:
# Create a figure
plt.figure(figsize=(10, 8))

# Create x and y variables
x = data['ceoten']
y = data['salary']

# Create a scatter plot
plt.scatter(x, y)
# Display the plot
plt.show()

### Adding labels and title

In [None]:
plt.scatter(x, y)

# Adding a title
plt.title('Salary in $1000s vs. Number of years in the Company as CEO')

# Adding the x and y labels
plt.xlabel('Years')
plt.ylabel('Salary')

plt.show()


### Color

In [None]:
plt.scatter(x,
            y,
            color='green')

# Adding a title
plt.title('Salary in $1000s vs. Number of years in the Company as CEO')

# Adding the x and y labels
plt.xlabel('Years')
plt.ylabel('Salary')

plt.show()

### Marker

In [None]:
plt.scatter(x,
            y,
            color='green',
            marker='*')

# Adding a title
plt.title('Salary in $1000s vs. Number of years in the Company as CEO')

# Adding the x and y labels
plt.xlabel('Years')
plt.ylabel('Salary')

plt.show()

### Size

In [None]:
plt.scatter(x,
            y,
            color='green',
            marker='*',
            s=100)

# Adding a title
plt.title('Salary in $1000s vs. Number of years in the Company as CEO')

# Adding the x and y labels
plt.xlabel('Years')
plt.ylabel('Salary')

plt.show()

### Transparency

In [None]:
plt.scatter(x,
            y,
            color='green',
            marker='*',
            s=100,
            alpha=0.1)

# Adding a title
plt.title('Salary in $1000s vs. Number of years in the Company as CEO')

# Adding the x and y labels
plt.xlabel('Years')
plt.ylabel('Salary')

plt.show()

### Border

In [None]:
plt.scatter(x,
            y,
            color='green',
            marker='*',
            s=100,
            alpha=0.5,
            edgecolors='black',
            linewidths=1)

# Adding a title
plt.title('Salary in $1000s vs. Number of years in the Company as CEO')

# Adding the x and y labels
plt.xlabel('Years')
plt.ylabel('Salary')

plt.show()

### Adding a legend

In [None]:
plt.scatter(x,
            y,
            color='green',
            marker='*',
            s=100,
            alpha=0.5,
            edgecolors='black',
            linewidths=0.5,
            label='Salary')

# Adding a title
plt.title('Salary in $1000s vs. Number of years in the Company as CEO')

# Adding the x and y labels
plt.xlabel('Years')
plt.ylabel('Salary')

# Adding the legend
plt.legend()

plt.show()

## Multiple plots

In [None]:
z = data['mktval']

plt.scatter(x, y, label='Salary')

# Adding a second plot
plt.scatter(x, z, label='Market Value', color='red', alpha=0.5)

plt.xlabel('Years as CEO')
plt.ylabel('Amount')
plt.title('Salary and Market Value vs. Years as CEO')
plt.legend()

plt.show()


## Subplot

In [None]:
# Create first subplot
plt.subplot(2, 1, 1)
plt.scatter(x, y, label='Salary')
plt.xlabel('Years as CEO')
plt.ylabel('Amount')
plt.title('Salary vs. Years as CEO')
plt.legend()

# Create second subplot
plt.subplot(2, 1, 2)
plt.scatter(x, z, label='Market Value', color='red')
plt.xlabel('Years as CEO')
plt.ylabel('Amount')
plt.title('Market Value vs. Years as CEO')
plt.legend()

# Adjust layout
plt.tight_layout()

plt.show()
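
An alternative to repeated `plt.subplot` calls is `plt.subplots`, which creates the figure and all axes at once and returns them as objects. The following is a minimal sketch of the same two-panel layout using synthetic stand-ins (`years`, `pay`, `value`) for the CEO data; the figure size and `sharex=True` are choices made for this example.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)             # seeded generator (assumption)
years = rng.integers(0, 30, 50)            # synthetic stand-in for ceoten
pay = rng.normal(800, 200, 50)             # synthetic stand-in for salary
value = rng.normal(3000, 1000, 50)         # synthetic stand-in for mktval

# Two rows, one column; sharex=True gives both panels the same x-axis range
fig, (ax_top, ax_bottom) = plt.subplots(2, 1, figsize=(8, 6), sharex=True)

ax_top.scatter(years, pay, label='Salary')
ax_top.set_ylabel('Salary')
ax_top.set_title('Salary vs. Years as CEO')
ax_top.legend()

ax_bottom.scatter(years, value, color='red', label='Market Value')
ax_bottom.set_xlabel('Years as CEO')
ax_bottom.set_ylabel('Market Value')
ax_bottom.set_title('Market Value vs. Years as CEO')
ax_bottom.legend()

# Adjust spacing so titles and labels do not overlap
fig.tight_layout()
plt.show()
```

Working with the axes objects makes it explicit which panel each call modifies, which tends to help as the number of panels grows.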
