# Introduction to NumPy, Pandas, and Matplotlib

In this notebook, we'll introduce three essential libraries for data science and scientific computing in Python:

- **NumPy:** A core package for numerical computations with efficient array operations.
- **Pandas:** A powerful library for data manipulation and analysis using DataFrames.
- **Matplotlib:** A versatile library for creating static, animated, and interactive visualizations.

Let's get started by exploring each library with some basic examples.

## Getting Started

Before you begin, make sure you have the necessary libraries installed. You can install them using pip:

```bash
pip install numpy pandas matplotlib
```
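
To confirm the installation worked, one quick check is to import each package and print its version. This is only a minimal sketch; the version numbers you see will depend on your environment.

```python
import numpy as np
import pandas as pd
import matplotlib

# Each package exposes its installed version as a string
print("NumPy:", np.__version__)
print("Pandas:", pd.__version__)
print("Matplotlib:", matplotlib.__version__)
```

If any of these imports raises `ModuleNotFoundError`, the corresponding package is not installed in the Python environment that is running the notebook.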

Now, let's dive into the basics of each library.

## 1. NumPy Basics

NumPy is the fundamental package for numerical computing in Python. It provides high-performance multidimensional arrays and a host of mathematical functions to operate on these arrays. In the following examples, we'll create arrays, perform vectorized operations, and explore array slicing.

In [None]:
import numpy as np

# Create a one-dimensional array
arr = np.array([1, 2, 3, 4, 5])
print("One-dimensional array:", arr)

# Create a two-dimensional array (matrix)
matrix = np.array([[1, 2, 3], [4, 5, 6]])
# sep="" keeps the first row aligned with the second (no stray leading space)
print("\nTwo-dimensional array (matrix):\n", matrix, sep="")
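
Every NumPy array also records its own layout. As a small illustrative addition, the sketch below re-creates the two arrays from the cell above and inspects their `ndim`, `shape`, `size`, and `dtype` attributes. The exact integer `dtype` is platform-dependent, so it is printed rather than assumed; the `mixed` array is a hypothetical extra used only to show type promotion.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
matrix = np.array([[1, 2, 3], [4, 5, 6]])

# ndim: number of axes; shape: length along each axis; size: total element count
print("arr:    ndim =", arr.ndim, " shape =", arr.shape, " size =", arr.size)
print("matrix: ndim =", matrix.ndim, " shape =", matrix.shape, " size =", matrix.size)

# dtype: the element type shared by every item in the array
print("matrix dtype:", matrix.dtype)

# Mixing integers and floats promotes the whole array to a floating-point dtype
mixed = np.array([1, 2.5, 3])  # hypothetical example data
print("mixed dtype:", mixed.dtype)  # float64
```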

### Array Operations and Slicing

NumPy arrays support efficient vectorized operations and flexible slicing. Here are some examples:

In [None]:
# Vectorized operation: add 10 to each element of the array
new_arr = arr + 10
print("Array after adding 10:", new_arr)

# Slicing: extract the elements at indices 1, 2, and 3
# (the start index 1 is included; the stop index 4 is excluded)
slice_arr = arr[1:4]
print("Sliced array (indices 1 to 3):", slice_arr)

# Slicing with a step: every second element
step_slice = arr[::2]
print("Every second element:", step_slice)
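
The same slicing syntax extends to two-dimensional arrays by separating the row and column selections with a comma. The sketch below re-creates `arr` and `matrix` from earlier so it can run on its own. It also illustrates one point worth knowing early: a basic slice is a view onto the original data, so writing through it changes the source array unless you call `.copy()`.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
matrix = np.array([[1, 2, 3], [4, 5, 6]])

# Row 0, all columns
print("First row:", matrix[0, :])        # [1 2 3]

# All rows, column 1
print("Second column:", matrix[:, 1])    # [2 5]

# All rows, columns 1 and 2 (a 2x2 sub-matrix)
print("Sub-matrix:\n", matrix[:, 1:3], sep="")

# A basic slice is a view: modifying it modifies the original array
view = arr[1:4]
view[0] = 99
print("arr after writing through the view:", arr)   # [ 1 99  3  4  5]

# Use .copy() when an independent array is needed
independent = arr[1:4].copy()
independent[0] = -1
print("arr is unchanged by the copy:", arr)         # [ 1 99  3  4  5]
```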

### Basic Statistical Operations

NumPy offers many built-in functions for statistical analysis. Here are a few common examples:

In [None]:
print("Mean of array:", np.mean(arr))
print("Sum of array:", np.sum(arr))
print("Standard deviation of array:", np.std(arr))
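
Two details are easy to miss here. By default, `np.std` computes the population standard deviation (it divides by `N`); passing `ddof=1` gives the sample standard deviation (dividing by `N - 1`). For multidimensional arrays, the `axis` argument controls whether a statistic is computed over the whole array, down each column, or across each row. A short sketch using the arrays defined earlier:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
matrix = np.array([[1, 2, 3], [4, 5, 6]])

# Population standard deviation (default, ddof=0) vs. sample standard deviation (ddof=1)
print("Population std:", np.std(arr))          # sqrt(2), about 1.4142
print("Sample std:", np.std(arr, ddof=1))      # sqrt(2.5), about 1.5811

# axis=0 collapses the rows, giving one value per column
print("Column means:", np.mean(matrix, axis=0))   # [2.5 3.5 4.5]

# axis=1 collapses the columns, giving one value per row
print("Row sums:", np.sum(matrix, axis=1))        # [ 6 15]
```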

## 2. Pandas Basics

Pandas is a powerful library for data manipulation and analysis. It introduces two key data structures: **Series** and **DataFrame**. In this section, we'll focus on DataFrames and explore how to create, view, and manipulate them.

In [None]:
import pandas as pd

# Create a DataFrame from a dictionary
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']
}

df = pd.DataFrame(data)
# sep="" prevents a stray space from shifting the header row of the printed table
print("DataFrame:\n", df, sep="")

# Display the first two rows of the DataFrame
print("\nFirst two rows:\n", df.head(2), sep="")
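
The text above mentions Series as the second key data structure. As a brief illustration, selecting one column of a DataFrame returns a Series, and the DataFrame itself reports its dimensions and per-column types. The sketch rebuilds the same `df` so it can run on its own; the `scores` Series is hypothetical data added only for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']
})

# shape is (rows, columns)
print("Shape:", df.shape)  # (4, 3)

# dtypes lists the data type of each column
print("\nColumn types:")
print(df.dtypes)

# A single column is a Series: a one-dimensional labeled array
ages = df['Age']
print("\nType of df['Age']:", type(ages).__name__)  # Series

# A Series can also be built directly, with an explicit index
# (hypothetical scores, not part of the original data)
scores = pd.Series([88, 92, 79], index=['Alice', 'Bob', 'Charlie'])
print("\nBob's score:", scores['Bob'])  # 92
```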

### Basic DataFrame Operations

Here are some common operations you can perform with Pandas DataFrames:

- **Descriptive Statistics:** Use `describe()` to get summary statistics of numerical columns.
- **Indexing and Filtering:** Access specific columns and filter rows based on conditions.
- **Sorting:** Sort the DataFrame based on column values.

In [None]:
# Descriptive statistics (sep="" avoids a stray space before each printed table)
print("\nDescriptive statistics:\n", df.describe(), sep="")

# Selecting a single column
print("\nNames column:\n", df['Name'], sep="")

# Filtering: select rows where Age is greater than 30
filtered_df = df[df['Age'] > 30]
print("\nRows where Age > 30:\n", filtered_df, sep="")

# Sorting the DataFrame by Age (ascending by default)
sorted_df = df.sort_values(by='Age')
print("\nDataFrame sorted by Age:\n", sorted_df, sep="")
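
These operations compose. As a sketch on the same data: conditions can be combined with `&` (and) or `|` (or), with each condition wrapped in parentheses; `.loc` selects rows and columns in one step; and `sort_values` accepts `ascending=False` to reverse the order. The variable names below are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']
})

# Combine conditions with & ; the parentheses around each condition are required
mask = (df['Age'] > 25) & (df['City'] != 'Chicago')
print("Age > 25 and not in Chicago:")
print(df[mask])

# .loc[rows, columns]: filter rows and pick columns together
names_over_30 = df.loc[df['Age'] > 30, ['Name', 'Age']]
print("\nName and Age where Age > 30:")
print(names_over_30)

# Sort from oldest to youngest
oldest_first = df.sort_values(by='Age', ascending=False)
print("\nSorted by Age, descending:")
print(oldest_first)
```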

## 3. Matplotlib Basics

Matplotlib is one of the most widely used libraries for data visualization in Python. It allows you to create a wide range of static, animated, and interactive plots. In this section, we'll create a simple line plot and a scatter plot.

In [None]:
import matplotlib.pyplot as plt

# Create data for plotting using NumPy
x = np.linspace(0, 10, 100)  # 100 evenly spaced points from 0 to 10 (both endpoints included)
y = np.sin(x)

# Create a line plot of the sine wave
plt.figure()
plt.plot(x, y, label='Sine Wave')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Line Plot of Sine Wave')
plt.legend()
plt.show()
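
The example above uses the `pyplot` state-based interface. Matplotlib also offers an object-oriented interface in which you work with explicit `Figure` and `Axes` objects, which tends to be easier to manage once a figure has more than one panel. A minimal sketch, plotting sine and cosine side by side (the cosine curve, figure size, and styling are added here purely for illustration):

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 100)

# One figure containing two side-by-side Axes
fig, (ax_sin, ax_cos) = plt.subplots(1, 2, figsize=(10, 4))

# Left panel: sine
ax_sin.plot(x, np.sin(x), label='sin(x)')
ax_sin.set_xlabel('x')
ax_sin.set_ylabel('sin(x)')
ax_sin.set_title('Sine')
ax_sin.legend()

# Right panel: cosine, drawn with a dashed orange line
ax_cos.plot(x, np.cos(x), color='orange', linestyle='--', label='cos(x)')
ax_cos.set_xlabel('x')
ax_cos.set_ylabel('cos(x)')
ax_cos.set_title('Cosine')
ax_cos.legend()

fig.tight_layout()  # adjust spacing so labels do not overlap
plt.show()
```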

### Additional Plot: Scatter Plot

Let's create a scatter plot using some randomly generated data.

In [None]:
# Generate random data for scatter plot
np.random.seed(0)  # for reproducibility
x_scatter = np.random.rand(50)
y_scatter = np.random.rand(50)

plt.figure()
plt.scatter(x_scatter, y_scatter, color='green', label='Data Points')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Scatter Plot Example')
plt.legend()
plt.show()
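
`np.random.seed` sets NumPy's legacy global random state. For new code, NumPy's documentation recommends creating an explicit `Generator` with `np.random.default_rng` instead. The sketch below draws comparable data that way; because the underlying generator differs, the points will not match the plot above even with the same seed. The `alpha` transparency value is an illustrative choice.

```python
import numpy as np
import matplotlib.pyplot as plt

# A seeded Generator gives reproducible draws without touching global state
rng = np.random.default_rng(0)
x_scatter = rng.random(50)  # 50 uniform samples in [0, 1)
y_scatter = rng.random(50)

plt.figure()
plt.scatter(x_scatter, y_scatter, color='green', alpha=0.7, label='Data Points')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Scatter Plot Example (default_rng)')
plt.legend()
plt.show()
```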

## Wrap-Up

In this notebook, you learned the basics of three fundamental libraries for data science:

- **NumPy:** Creating arrays, performing vectorized operations, slicing arrays, and computing basic statistics.
- **Pandas:** Creating DataFrames, computing descriptive statistics, and selecting, filtering, and sorting data.
- **Matplotlib:** Creating simple visualizations such as line plots and scatter plots.

These libraries form the backbone of data analysis and visualization in Python. Experiment with the examples provided and explore further to enhance your skills.

Happy coding!

Author: [@gauranshkumar](https://gauransh.dev)