# Introduction to NumPy, Matplotlib, Pandas

This script gives a short overview of NumPy, Matplotlib and Pandas.
* NumPy provides a new data type, the array, which is useful for numerical calculations. It is used heavily in AI courses.
* Matplotlib is used for plotting. In general, the data plotted is a NumPy array or something that can be converted to one, such as a list or a pandas DataFrame.
* Pandas is a library for handling tabular data. Intuitively, things you can do in Excel can also be done in Pandas.
  If you don't know Excel, you can learn it from this video series: https://www.youtube.com/watch?v=4UMLFC1SoHM&list=PLgzaMbMPEHEx2aR9-EXfD6psvezSMcHJ6&index=1&t=15s
  See chapters 1-6 and chapter 8, which cover the basics.
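
As a small illustration of why the array type is convenient for calculations, the sketch below doubles some values first with a plain Python list and then with a NumPy array. The variable names and numbers are made up for this example.

```python
import numpy as np

# Hypothetical example values, not taken from any dataset.
prices_list = [10.0, 20.0, 30.0]
prices_array = np.array(prices_list)

# With a list, arithmetic needs an explicit loop or comprehension.
doubled_list = [p * 2 for p in prices_list]

# With an array, the operation is applied element by element.
doubled_array = prices_array * 2

print(doubled_list)
print(doubled_array)
```

Note that `prices_list * 2` would not double the values; for a list it repeats the list instead.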

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# NumPy

In [None]:
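A = np.arange(1, 16).reshape(3, 5)  # Integers 1..15 arranged in 3 rows and 5 columns.
print(A)
print(A.ndim)   # Number of dimensions.
print(A.shape)  # Size along each dimension: (rows, columns).
print(A.size)   # Total number of elements.
print(A.dtype)  # Data type of the elements.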

In [None]:
B = np.arange(1, 10).reshape(3, 3)
print(B)

In [None]:
print(np.min(B))           # Smallest element of the whole array.
print(np.min(B, axis=0))   # Minimum of each column (reduces along the rows).
print(np.min(B, axis=1))   # Minimum of each row (reduces along the columns).
print()

print(np.argmin(B))          # Index of the minimum in the flattened array.
print(np.argmin(B, axis=0))  # Row index of the minimum in each column.
print()

print(np.sum(B))             # Sum of all elements.
print(np.mean(B, axis=1))    # Mean of each row.
print(np.median(B, axis=1))  # Median of each row.
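
The `axis` argument and the flattened index are easy to mix up, so here is a minimal sketch that spells them out for the same 3x3 array. `np.argmax` is used for the flattened index only because its result is less trivial than `np.argmin` for this array; both behave the same way. The variable names are chosen for this example.

```python
import numpy as np

B = np.arange(1, 10).reshape(3, 3)  # Same values as the array B above.

col_min = np.min(B, axis=0)  # One value per column: the rows are collapsed.
row_min = np.min(B, axis=1)  # One value per row: the columns are collapsed.

# Without an axis, argmax returns a position in the flattened array.
flat_idx = np.argmax(B)

# np.unravel_index converts that position back to a (row, column) pair.
row, col = np.unravel_index(flat_idx, B.shape)

print(col_min, row_min)
print(flat_idx, (row, col))
```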

# Matplotlib

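Plotting functions expect a numpy.array, or an object that can be passed to numpy.asarray. Classes that are similar to arrays ('array-like'), such as pandas data objects, may not work as intended, so a common convention is to convert them to NumPy arrays before plotting.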
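
A minimal sketch of converting array-like pandas objects to NumPy arrays before plotting. The DataFrame and its column names are hypothetical and only serve as an illustration.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical data for illustration only.
df = pd.DataFrame({"x": [0, 1, 2, 3], "y": [0, 1, 4, 9]})

# Convert the pandas columns to NumPy arrays before plotting.
x = np.asarray(df["x"])
y = df["y"].to_numpy()  # Equivalent pandas method.

fig, ax = plt.subplots()
ax.plot(x, y, marker="o")
ax.set_xlabel("x")
ax.set_ylabel("y")
```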

* Figure: The whole figure, which can be seen as a canvas.
* Axes: The part of the canvas that a plot is drawn on. Plotting methods are called directly on the Axes, which gives great flexibility in customizing plots.
* Axis: Sets the scale and limits and generates the ticks (the marks on the Axis) and tick labels (the strings labeling the ticks).
* Artist: Basically everything visible on the figure is an Artist, including the Figure, Axes and Axis objects. Most Artists are tied to an Axes.
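
The four terms above can be sketched in code as follows. This is only an illustration; the sine curve and the tick positions are arbitrary choices for this example.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.artist import Artist

x = np.linspace(0, 2 * np.pi, 50)

fig, ax = plt.subplots()        # fig is the Figure, ax is a single Axes.
line, = ax.plot(x, np.sin(x))   # plot() returns a list of line Artists.

# The Axis objects (ax.xaxis, ax.yaxis) hold limits, ticks and tick labels.
ax.set_xlim(0, 2 * np.pi)
ax.set_xticks([0, np.pi, 2 * np.pi])
ax.set_xticklabels(["0", "pi", "2 pi"])

# Figure, Axes, Axis and the plotted line are all Artists.
for obj in (fig, ax, ax.xaxis, line):
    print(type(obj).__name__, isinstance(obj, Artist))
```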


In [None]:
x = np.linspace(-4, 4, 100)
# print(x)
y = x**2

fig, ax = plt.subplots()
ax.plot(x, y)
ax.set_title('y = x^2')
ax.set_xlabel('x')
ax.set_ylabel('y')

In [None]:
np.random.seed(15)
random_data_x = np.random.randn(1000)
random_data_y = np.random.randn(1000)
x = np.linspace(-2, 2, 100)
y = x**2

fruit_data = {'grapes': 22, 'apple': 8, 'orange': 15, 'lemon': 20, 'lime': 25}
names = list(fruit_data.keys())
values = list(fruit_data.values())

# Create a figure with two subplots side by side.
fig, axs = plt.subplots(1, 2, layout='constrained')
fig.suptitle('Different Plots', size=30)

axs[0].scatter(random_data_x, random_data_y)
axs[0].set_title('Scatter Plot')

axs[1].bar(names, values)
axs[1].set_title('Bar Plot')

# Pandas

In [None]:
cars = pd.read_csv("cars_data.csv")

In [None]:
cars.head(10)

In [None]:
cars.info()

In [None]:
# Drop every row that contains at least one missing value.
cars.dropna(how='any', inplace=True)

In [None]:
# Mean of each numeric column.
cars.mean(numeric_only=True)

In [None]:
# Average price per company, sorted from highest to lowest.
avg_price = cars.groupby("company")["price"].mean().sort_values(ascending=False)

# Bar chart of the average prices.
avg_price.plot(kind='bar', figsize=(10, 6), title='Average Car Price by Company', color='skyblue')
plt.ylabel('Average Price')
plt.tight_layout()
plt.show()

In [None]:
# Scatter plot
cars.plot(kind='scatter', x='horsepower', y='price', title='Horsepower vs. Price', color='green', figsize=(8,6))
plt.grid(True)
plt.tight_layout()
plt.show()
