# Installing Modules with Pip

## Introduction
In Python, modules are reusable pieces of code that you can import into your project to provide specific functionality. Pip is Python's package manager: it downloads and installs the packages that contain these modules, by default from the Python Package Index (PyPI).

## Installing a Module
To install a module using pip, run the following command in your terminal or command prompt, replacing `module_name` with the name of the package you need:

```bash
pip install module_name
```

# Introduction to NumPy

## What is NumPy?
NumPy is a numerical library for Python that provides large, multi-dimensional arrays and matrices, along with mathematical functions that operate on them.

## Installation
To install NumPy using pip, run the following command:

```bash
pip install numpy
```
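
After installing, it can be useful to confirm that Python actually sees the package. The following is a minimal sketch, assuming NumPy was installed into the same environment that runs this code; the variable names are illustrative only.

```python
import importlib.metadata

import numpy as np

# Version string reported by the imported module itself
module_version = np.__version__

# Version recorded in the installed package metadata (what pip reports)
installed_version = importlib.metadata.version("numpy")

print("NumPy version (module):", module_version)
print("NumPy version (package metadata):", installed_version)
```

If the import fails with `ModuleNotFoundError`, pip most likely installed the package into a different Python environment than the one running the notebook.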

# Importing the necessary libraries

In [3]:
# Importing the necessary libraries
# %pip install numpy  # uncomment to install NumPy from inside the notebook
import numpy as np

# Basic NumPy Functions for Data Scientists


### Creating arrays

In [4]:
# Creating arrays
arr1 = np.array([1, 2, 3, 4, 5])
arr2 = np.arange(0, 10, 2)  # Using arange: values from 0 up to (not including) 10, in steps of 2
arr3 = np.linspace(0, 1, 5)  # Using linspace: 5 evenly spaced values from 0 to 1, endpoint included

print("arr1 values: ", arr1)
print("arr2 values: ", arr2)
print("arr3 values: ", arr3)

arr1 values:  [1 2 3 4 5]
arr2 values:  [0 2 4 6 8]
arr3 values:  [0.   0.25 0.5  0.75 1.  ]
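
The difference between `arange` and `linspace` is easy to miss: `arange` takes a step and excludes the stop value, while `linspace` takes a count and includes the stop value by default. A small sketch of that contrast follows; the variable names are illustrative and not part of the notebook.

```python
import numpy as np

# arange: step-based, stop value is excluded
half_open = np.arange(0, 10, 2)

# linspace: count-based, stop value is included by default
closed = np.linspace(0, 10, 6)

# linspace with endpoint=False behaves like a half-open range
no_endpoint = np.linspace(0, 10, 5, endpoint=False)

# Arrays pre-filled with a constant are often handy as starting points
zeros = np.zeros((2, 3))
ones = np.ones(4)

print("arange:              ", half_open)
print("linspace:            ", closed)
print("linspace, no endpoint:", no_endpoint)
print("zeros shape:", zeros.shape, "ones shape:", ones.shape)
```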


### Array operations


In [5]:
# Array operations
sum_arr = np.sum(arr1)
mean_arr = np.mean(arr2)
std_dev_arr = np.std(arr3)
print("Sum:", sum_arr)
print("Mean:", mean_arr)
print("Standard Deviation:", std_dev_arr)

Sum: 15
Mean: 4.0
Standard Deviation: 0.3535533905932738
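
Note that `np.std` computes the population standard deviation by default (it divides by `N`). When the data is a sample and you want the sample standard deviation (dividing by `N - 1`), pass `ddof=1`. A short sketch using the same values as `arr3`:

```python
import numpy as np

arr3 = np.linspace(0, 1, 5)  # same values as arr3 in the cell above

# Default: population standard deviation (ddof=0, divides by N)
population_std = np.std(arr3)

# Sample standard deviation (ddof=1, divides by N - 1)
sample_std = np.std(arr3, ddof=1)

print("Population std:", population_std)
print("Sample std:    ", sample_std)
```

The sample value is always the larger of the two, and the gap shrinks as the array grows.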


### Reshaping and stacking arrays

In [8]:
# Creating a 2D array
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])

# Reshaping to 2x3; arr_2d already has this shape, so the result has the same layout
reshaped_arr = arr_2d.reshape((2, 3))
print("reshaped_arr shape:", reshaped_arr.shape)

# Stacking arrays vertically
stacked_arr = np.vstack((arr_2d, reshaped_arr))

# Displaying the arrays
print("Original Array:")
print(arr_2d)
print("Reshaped Array:")
print(reshaped_arr)
print("Stacked Array:")
print(stacked_arr)

reshaped_arr shape: (2, 3)
Original Array:
[[1 2 3]
 [4 5 6]]
Reshaped Array:
[[1 2 3]
 [4 5 6]]
Stacked Array:
[[1 2 3]
 [4 5 6]
 [1 2 3]
 [4 5 6]]
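
Because the array above is already 2x3, the reshape leaves it looking the same. The sketch below shows reshapes that actually change the shape, plus horizontal stacking as the counterpart of `vstack`. Variable names other than `arr_2d` are illustrative.

```python
import numpy as np

arr_2d = np.array([[1, 2, 3], [4, 5, 6]])

# Same six values, rearranged row by row into 3 rows and 2 columns
three_by_two = arr_2d.reshape((3, 2))

# -1 asks NumPy to infer that dimension from the total number of elements
inferred = arr_2d.reshape(-1, 2)
flat = arr_2d.reshape(-1)

# hstack joins arrays side by side (the row counts must match)
side_by_side = np.hstack((arr_2d, arr_2d))

print("3x2 reshape:\n", three_by_two)
print("Flattened:", flat)
print("hstack shape:", side_by_side.shape)
```

The total number of elements must stay the same; `arr_2d.reshape((4, 2))` would raise a `ValueError`.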


### Indexing and slicing


In [9]:
# Indexing and slicing: take every element from index 2 onward
print(arr1)
arr_slice = arr1[2:]
print("Sliced Array:", arr_slice)

[1 2 3 4 5]
Sliced Array: [3 4 5]
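
One detail worth knowing: a basic slice is a view onto the original array, not a copy, so writing through the slice changes the original. The sketch below illustrates this, along with boolean-mask indexing; the names `view`, `independent`, and `evens` are illustrative.

```python
import numpy as np

arr1 = np.array([1, 2, 3, 4, 5])

# A basic slice shares memory with the original array
view = arr1[2:]
view[0] = 99  # this also changes arr1[2]

# .copy() gives an independent array
independent = arr1[2:].copy()
independent[0] = -1  # arr1 is not affected

# Boolean masks select elements that satisfy a condition
evens = arr1[arr1 % 2 == 0]

# Negative indices count from the end
last_two = arr1[-2:]

print("arr1 after writing through the view:", arr1)
print("Even values:", evens)
print("Last two values:", last_two)
```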


# Advanced NumPy Functions for Data Scientists


### Broadcasting and element-wise operations


In [10]:
# Broadcasting and element-wise operations
broadcasted_arr = np.arange(3)[:, np.newaxis] + np.arange(3)
elementwise_mult = np.multiply(arr1, arr2)

print("Broadcasted Array:")
print(broadcasted_arr)
print("Element-wise Multiplication:")
print(arr1)
print(arr2)
print(elementwise_mult)

Broadcasted Array:
[[0 1 2]
 [1 2 3]
 [2 3 4]]
Element-wise Multiplication:
[1 2 3 4 5]
[0 2 4 6 8]
[ 0  4 12 24 40]
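
The broadcasted result above comes from combining a column of shape `(3, 1)` with a row of shape `(3,)`: NumPy stretches each along its size-1 (or missing) dimension to produce a `(3, 3)` array. The sketch below spells out the shapes and shows a common use, subtracting column means from a small made-up table.

```python
import numpy as np

column = np.arange(3)[:, np.newaxis]  # shape (3, 1)
row = np.arange(3)                    # shape (3,)
grid = column + row                   # broadcast to shape (3, 3)

print("Shapes:", column.shape, "+", row.shape, "->", grid.shape)

# Typical use: centre each column of a table on its own mean.
# The values here are illustrative only.
data = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])
column_means = data.mean(axis=0)   # shape (3,)
centered = data - column_means     # (2, 3) - (3,) broadcasts row-wise

print("Column means:", column_means)
print("Centered data:\n", centered)
```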


### NumPy statistical functions


In [11]:
# NumPy statistical functions
percentile_arr = np.percentile(arr1, [25, 50, 75])
correlation_coefficient = np.corrcoef(arr1, arr2)
print("Percentiles:", percentile_arr)
print("Correlation Coefficient:")
print(correlation_coefficient)

Percentiles: [2. 3. 4.]
Correlation Coefficient:
[[1. 1.]
 [1. 1.]]
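
`np.corrcoef` returns the full correlation matrix, so the coefficient between the two inputs sits off the diagonal at position `[0, 1]`. It equals 1 here because `arr2` is an exact linear function of `arr1`. A short sketch, with `r` and `reversed_r` as illustrative names:

```python
import numpy as np

arr1 = np.array([1, 2, 3, 4, 5])
arr2 = np.arange(0, 10, 2)  # equals 2 * arr1 - 2

corr_matrix = np.corrcoef(arr1, arr2)

# The diagonal holds each array's correlation with itself (always 1);
# the off-diagonal entry is the correlation between arr1 and arr2.
r = corr_matrix[0, 1]

# Reversing one array gives a perfect negative correlation
reversed_r = np.corrcoef(arr1, arr1[::-1])[0, 1]

# The 50th percentile is the median
median = np.median(arr1)
p50 = np.percentile(arr1, 50)

print("r(arr1, arr2):", r)
print("r(arr1, reversed arr1):", reversed_r)
print("Median:", median, "50th percentile:", p50)
```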


### Random numbers and sampling


In [12]:
# Random numbers and sampling
random_normal = np.random.normal(0, 1, size=(3, 3))
random_integers = np.random.randint(1, 15, size=(1, 4))
print("Random Normal Distribution:")
print(random_normal)
print("Random Integers:")
print(random_integers)

Random Normal Distribution:
[[ 1.05099859  0.41762794 -0.34673106]
 [ 0.7611953  -1.28828368  1.38223109]
 [-1.38287101  0.331656   -0.26751375]]
Random Integers:
[[ 8 14  1 12]]
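
The values above change on every run because no seed is set. When results need to be reproducible, one option is NumPy's `Generator` interface created with `np.random.default_rng`. The sketch below assumes an arbitrary seed of 42; the variable names are illustrative.

```python
import numpy as np

# A seeded generator produces the same sequence on every run
rng = np.random.default_rng(seed=42)
normal_sample = rng.normal(loc=0, scale=1, size=(3, 3))

# As with np.random.randint, the upper bound is exclusive: values are 1..14
integer_sample = rng.integers(low=1, high=15, size=(1, 4))

# A second generator with the same seed reproduces the first draw exactly
rng_again = np.random.default_rng(seed=42)
repeat_sample = rng_again.normal(loc=0, scale=1, size=(3, 3))

print("Same draw from the same seed:", np.array_equal(normal_sample, repeat_sample))
print("Random integers:", integer_sample)
```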


### Linear algebra operations



In [13]:
# Linear algebra operations
matrix_a = np.array([[1, 2], [3, 4]])
matrix_b = np.array([[5, 6], [7, 8]])
matrix_inverse = np.linalg.inv(matrix_a)
eigenvalues, eigenvectors = np.linalg.eig(matrix_a)
print("Matrix Inverse:")
print(matrix_inverse)
print("Eigenvalues:")
print(eigenvalues)
print("Eigenvectors:")
print(eigenvectors)

Matrix Inverse:
[[-2.   1. ]
 [ 1.5 -0.5]]
Eigenvalues:
[-0.37228132  5.37228132]
Eigenvectors:
[[-0.82456484 -0.41597356]
 [ 0.56576746 -0.90937671]]
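
A quick way to sanity-check these results is to verify the defining identities: a matrix times its inverse should give the identity, and each eigenvector `v` with eigenvalue `λ` should satisfy `A v = λ v`. The sketch below does this with `np.allclose`, which tolerates floating-point rounding, and also uses `matrix_b` in a matrix product.

```python
import numpy as np

matrix_a = np.array([[1, 2], [3, 4]])
matrix_b = np.array([[5, 6], [7, 8]])

# Matrix product (not element-wise multiplication)
product = matrix_a @ matrix_b

# A times its inverse should be the identity matrix
matrix_inverse = np.linalg.inv(matrix_a)
inverse_ok = np.allclose(matrix_a @ matrix_inverse, np.eye(2))

# Each column of `eigenvectors` pairs with the eigenvalue at the same index,
# so A @ V should equal V with every column scaled by its eigenvalue.
eigenvalues, eigenvectors = np.linalg.eig(matrix_a)
eigen_ok = np.allclose(matrix_a @ eigenvectors, eigenvectors * eigenvalues)

# A non-zero determinant confirms the matrix is invertible
determinant = np.linalg.det(matrix_a)

print("Product:\n", product)
print("Inverse check passed:", inverse_ok)
print("Eigen check passed:", eigen_ok)
print("Determinant:", determinant)
```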


### Saving and loading arrays




In [14]:
# Saving and loading arrays (.npy is NumPy's binary format for a single array)
np.save('saved_array.npy', arr3)
loaded_arr = np.load('saved_array.npy')
print("Loaded Array:", loaded_arr)

Loaded Array: [0.   0.25 0.5  0.75 1.  ]
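
`np.save` stores one array per file. To keep several arrays together, `np.savez` writes a single `.npz` archive, and `np.savetxt` writes plain text that other tools can read. The sketch below uses a temporary directory so that it leaves no files behind; the file names and keyword names are illustrative.

```python
import os
import tempfile

import numpy as np

arr1 = np.array([1, 2, 3, 4, 5])
arr3 = np.linspace(0, 1, 5)

with tempfile.TemporaryDirectory() as tmp_dir:
    # Several arrays in one archive, each stored under a keyword name
    archive_path = os.path.join(tmp_dir, "arrays.npz")
    np.savez(archive_path, integers=arr1, fractions=arr3)

    with np.load(archive_path) as archive:
        stored_names = sorted(archive.files)
        loaded_integers = archive["integers"]
        loaded_fractions = archive["fractions"]

    # Plain-text export and import
    text_path = os.path.join(tmp_dir, "fractions.csv")
    np.savetxt(text_path, arr3, delimiter=",")
    loaded_text = np.loadtxt(text_path, delimiter=",")

print("Arrays in archive:", stored_names)
print("Loaded integers:", loaded_integers)
print("Loaded from text:", loaded_text)
```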


In [15]:
# Let's create some sample data for monthly sales of three products over a year.
monthly_sales_data = np.random.randint(50, 200, size=(12, 3))
print("Sample data for monthly sales:\n ", monthly_sales_data)

Sample data for monthly sales:
  [[152 123 153]
 [ 69 103 118]
 [164 197  87]
 [ 88  71 114]
 [ 66 148  64]
 [124 156 179]
 [165 116  87]
 [ 97 127  51]
 [154  50 143]
 [ 53  53 152]
 [154  63 179]
 [170 156  59]]


In [16]:
# Calculate total sales for each product over the year
total_sales = np.sum(monthly_sales_data, axis=0)
print("Total Sales for Each Product:", total_sales)

Total Sales for Each Product: [1456 1363 1386]


In [15]:
# Find the month with the highest total sales (months are numbered from 1)
best_month = np.argmax(np.sum(monthly_sales_data, axis=1)) + 1
print("Best Sales Month:", best_month)

Best Sales Month: 6


# Scenario: Financial Analysis for a Retail Store
Imagine you are working with a retail store that sells electronic gadgets. The store maintains **monthly sales data** for three types of products: *smartphones*, *laptops*, and *smartwatches*. Your task is to ***perform a financial analysis using NumPy to derive meaningful insights***.


In [18]:
# Sample monthly sales data (in thousands of dollars)
monthly_sales_data = np.array([
    [120, 80, 50],
    [130, 90, 55],
    [110, 85, 60],
    [140, 75, 45],
    [150, 95, 65],
    [130, 80, 55],
    [110, 75, 50],
    [100, 70, 40],
    [120, 85, 60],
    [140, 100, 70],
    [160, 110, 80],
    [180, 120, 90]
])

# Products: 0 - Smartphones, 1 - Laptops, 2 - Smartwatches
product_labels = ['Smartphones', 'Laptops', 'Smartwatches']


### Tasks:
1. ***Total Sales Analysis:***
   - Calculate the total sales for each product over the entire year.
   - Identify the product with the highest total sales.
2. ***Monthly Performance Analysis:***
   - Find the month with the highest sales for each product.
   - Determine the overall best-selling month.
3. ***Average Monthly Sales:***
   - Calculate the average monthly sales for each product.
   - Find the product with the highest average monthly sales.
4. ***Yearly Performance Summary:***
   - Summarize the yearly performance by calculating the total sales, average monthly sales, and the best-selling product.


### 1. Total Sales Analysis

In [19]:
# Quick refresher: np.argmax returns the index of the largest value, not the value itself
array = np.array([1, 2, 5, 7, 0, 9])
highest_index = np.argmax(array)
print(array[highest_index])

9


In [20]:
# 1. Total Sales Analysis
total_sales = np.sum(monthly_sales_data, axis=0)
best_selling_product = np.argmax(total_sales)
print("Total Sales for Each Product:", total_sales)
print("Best Selling Product:", product_labels[best_selling_product])

Total Sales for Each Product: [1590 1065  720]
Best Selling Product: Smartphones
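
Beyond the single best seller, it can help to rank all products and see each one's share of total revenue. The sketch below is one way to do that with `np.argsort`, assuming the same sales data as above; `ranking` and `share_percent` are illustrative names.

```python
import numpy as np

# Same monthly sales data as above (in thousands of dollars)
monthly_sales_data = np.array([
    [120, 80, 50], [130, 90, 55], [110, 85, 60], [140, 75, 45],
    [150, 95, 65], [130, 80, 55], [110, 75, 50], [100, 70, 40],
    [120, 85, 60], [140, 100, 70], [160, 110, 80], [180, 120, 90],
])
product_labels = ['Smartphones', 'Laptops', 'Smartwatches']

total_sales = monthly_sales_data.sum(axis=0)

# argsort returns indices in ascending order; reverse for best-first
ranking = np.argsort(total_sales)[::-1]

# Each product's share of the combined yearly revenue, in percent
share_percent = total_sales / total_sales.sum() * 100

for position, index in enumerate(ranking, start=1):
    print(f"{position}. {product_labels[index]}: "
          f"{total_sales[index]} ({share_percent[index]:.1f}%)")
```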


### 2. Monthly Performance Analysis

In [21]:
# 2. Monthly Performance Analysis
best_month_each_product = np.argmax(monthly_sales_data, axis=0) + 1
overall_best_month = np.argmax(np.sum(monthly_sales_data, axis=1)) + 1
print(np.sum(monthly_sales_data, axis=1))
print("Best Sales Month for Each Product:", best_month_each_product)
print("Overall Best Sales Month:", overall_best_month)

[250 275 255 260 310 265 235 210 265 310 350 390]
Best Sales Month for Each Product: [12 12 12]
Overall Best Sales Month: 12
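
Keep in mind that `np.argmax` returns only the first position when several entries share the maximum. If ties matter, one option is to compare against the maximum and collect every matching index, as sketched below. The monthly totals are copied from the output above; `tied_totals` is a hypothetical example constructed to contain a tie.

```python
import numpy as np

# Monthly totals across all three products, as printed above
monthly_totals = np.array([250, 275, 255, 260, 310, 265, 235, 210, 265, 310, 350, 390])

# Every month whose total equals the maximum (1-based month numbers)
best_months = np.flatnonzero(monthly_totals == monthly_totals.max()) + 1

# The weakest month, found the same way with argmin
worst_month = np.argmin(monthly_totals) + 1

# Hypothetical data with a tie for first place
tied_totals = np.array([310, 390, 390])
first_only = np.argmax(tied_totals) + 1
all_tied = np.flatnonzero(tied_totals == tied_totals.max()) + 1

print("Best month(s):", best_months)
print("Worst month:", worst_month)
print("argmax on tied data:", first_only, "| all tied months:", all_tied)
```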


### 3. Average Monthly Sales

In [22]:
# 3. Average Monthly Sales
average_monthly_sales = np.mean(monthly_sales_data, axis=0)
best_selling_product_avg = np.argmax(average_monthly_sales)
print("Average Monthly Sales for Each Product:", average_monthly_sales)
print("Product with the Highest Average Monthly Sales:", product_labels[best_selling_product_avg])

Average Monthly Sales for Each Product: [132.5   88.75  60.  ]
Product with the Highest Average Monthly Sales: Smartphones


### 4. Yearly Performance Summary

In [23]:
# 4. Yearly Performance Summary
yearly_summary = {
    'Total Sales': total_sales,
    'Average Monthly Sales': average_monthly_sales,
    'Best Selling Product': product_labels[best_selling_product]
}
print("\nYearly Performance Summary:")
for key, value in yearly_summary.items():
    print(f"{key}: {value}")


Yearly Performance Summary:
Total Sales: [1590 1065  720]
Average Monthly Sales: [132.5   88.75  60.  ]
Best Selling Product: Smartphones


### Exercise:
Imagine the retail store is planning to introduce a new product category, ***"Tablets,"*** in the coming year. Assume that the monthly sales data for tablets is estimated to be:


In [24]:
# Estimated monthly sales data for Tablets (in thousands of dollars)
tablet_sales_data = np.array([100, 70, 40, 120, 80, 50, 110, 85, 60, 130, 90, 55])

### Tasks:
1. ***Integration of Tablets:***
   - Integrate the tablet sales data into the existing monthly_sales_data array.
   - Update the product_labels accordingly.
2. ***Re-analysis with Tablets:***
   - Recalculate the total sales for each product, considering the tablets.
   - Determine the product with the highest total sales.
3. ***Impact Analysis:***
   - Analyze the impact of adding tablets on the overall best-selling month.
   - Compare the average monthly sales for each product before and after adding tablets.




### 1. Integration of Tablets

In [25]:
# 1. Integration of Tablets
monthly_sales_data = np.column_stack((monthly_sales_data, tablet_sales_data))
product_labels.append('Tablets')  # keep the labels aligned with the new fourth column
print("The new monthly sales data: \n", monthly_sales_data)
print("The new product labels: \n", product_labels)

The new monthly sales data: 
 [[120  80  50 100]
 [130  90  55  70]
 [110  85  60  40]
 [140  75  45 120]
 [150  95  65  80]
 [130  80  55  50]
 [110  75  50 110]
 [100  70  40  85]
 [120  85  60  60]
 [140 100  70 130]
 [160 110  80  90]
 [180 120  90  55]]
The new product labels: 
 ['Smartphones', 'Laptops', 'Smartwatches', 'Tablets']
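
Overwriting `monthly_sales_data` in place means that re-running the cell adds the tablet column a second time. A possible variant, sketched below, stores the result under new names so the originals stay available for before/after comparisons. The names `sales_with_tablets` and `labels_with_tablets` are illustrative, not part of the notebook.

```python
import numpy as np

monthly_sales_data = np.array([
    [120, 80, 50], [130, 90, 55], [110, 85, 60], [140, 75, 45],
    [150, 95, 65], [130, 80, 55], [110, 75, 50], [100, 70, 40],
    [120, 85, 60], [140, 100, 70], [160, 110, 80], [180, 120, 90],
])
tablet_sales_data = np.array([100, 70, 40, 120, 80, 50, 110, 85, 60, 130, 90, 55])
product_labels = ['Smartphones', 'Laptops', 'Smartwatches']

# column_stack treats the 1D tablet array as a new column
sales_with_tablets = np.column_stack((monthly_sales_data, tablet_sales_data))

# Build a new list instead of appending, so the original labels are unchanged
labels_with_tablets = product_labels + ['Tablets']

# Equivalent: turn the 1D array into a (12, 1) column and join along axis 1
alternative = np.concatenate(
    (monthly_sales_data, tablet_sales_data[:, np.newaxis]), axis=1
)

print("Original shape:", monthly_sales_data.shape)
print("New shape:", sales_with_tablets.shape)
print("Labels:", labels_with_tablets)
```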


### 2. Re-analysis with Tablets


In [26]:
# 2. Re-analysis with Tablets
total_sales_with_tablets = np.sum(monthly_sales_data, axis=0)
best_selling_product_with_tablets = np.argmax(total_sales_with_tablets)
print("\nTotal Sales for Each Product (with Tablets):", total_sales_with_tablets)
print("Best Selling Product (with Tablets):", product_labels[best_selling_product_with_tablets])


Total Sales for Each Product (with Tablets): [1590 1065  720  990]
Best Selling Product (with Tablets): Smartphones


### 3. Impact Analysis


In [27]:
# 3. Impact Analysis
best_month_each_product_with_tablets = np.argmax(monthly_sales_data, axis=0) + 1
overall_best_month_with_tablets = np.argmax(np.sum(monthly_sales_data, axis=1)) + 1
print("\nBest Sales Month for Each Product (with Tablets):", best_month_each_product_with_tablets)
print("Overall Best Sales Month (with Tablets):", overall_best_month_with_tablets)


Best Sales Month for Each Product (with Tablets): [12 12 12 10]
Overall Best Sales Month (with Tablets): 12
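
To judge the impact on the best-selling month, the totals before and after can be compared directly. In the sketch below, the "before" totals are the monthly sums printed earlier for the three original products; the margin calculation is an illustrative addition rather than part of the original exercise.

```python
import numpy as np

# Monthly totals for the three original products (printed earlier)
totals_before = np.array([250, 275, 255, 260, 310, 265, 235, 210, 265, 310, 350, 390])
tablet_sales_data = np.array([100, 70, 40, 120, 80, 50, 110, 85, 60, 130, 90, 55])

totals_after = totals_before + tablet_sales_data

best_before = np.argmax(totals_before) + 1
best_after = np.argmax(totals_after) + 1

# Lead of the best month over the runner-up, before and after
sorted_before = np.sort(totals_before)
sorted_after = np.sort(totals_after)
margin_before = sorted_before[-1] - sorted_before[-2]
margin_after = sorted_after[-1] - sorted_after[-2]

print("Best month before:", best_before, "| after:", best_after)
print("Lead over runner-up before:", margin_before, "| after:", margin_after)
```

With these numbers the best month stays the same, but its lead over the next-best month shrinks from 40 to 5 (thousand dollars), because tablet sales peak in month 10 rather than month 12.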


### 4. Compare Average Monthly Sales


In [28]:
# 4. Compare Average Monthly Sales
average_monthly_sales_with_tablets = np.mean(monthly_sales_data, axis=0)
best_selling_product_avg_with_tablets = np.argmax(average_monthly_sales_with_tablets)
print("\nAverage Monthly Sales for Each Product (with Tablets):", average_monthly_sales_with_tablets)
print("Product with the Highest Average Monthly Sales (with Tablets):", product_labels[best_selling_product_avg_with_tablets])


Average Monthly Sales for Each Product (with Tablets): [132.5   88.75  60.    82.5 ]
Product with the Highest Average Monthly Sales (with Tablets): Smartphones
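
The cell above recomputes the averages but does not put the two sets side by side. One way to make the comparison explicit is sketched below, using the averages printed earlier. The per-product averages of the existing products cannot change, since their columns were not modified; what changes is the store-wide average revenue per month. The names `store_average_before` and `tablet_share_percent` are illustrative.

```python
import numpy as np

# Average monthly sales printed earlier (in thousands of dollars)
average_before = np.array([132.5, 88.75, 60.0])
average_after = np.array([132.5, 88.75, 60.0, 82.5])

# The existing products' averages are unaffected by the new column
existing_unchanged = np.allclose(average_after[:3], average_before)

# Store-wide average revenue per month is the sum of the per-product averages
store_average_before = average_before.sum()
store_average_after = average_after.sum()

# Tablets' share of the new store-wide monthly average, in percent
tablet_share_percent = average_after[3] / store_average_after * 100

print("Existing product averages unchanged:", existing_unchanged)
print("Store-wide monthly average before:", store_average_before)
print("Store-wide monthly average after: ", store_average_after)
print(f"Tablet share of monthly revenue: {tablet_share_percent:.1f}%")
```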
