# **Linear Combination**

### You will work with a simple data set that contains factory production values of electrical machinery parts for a corporation. You will perform the following tasks:
- Load and study the data
- Extract production values as vectors from the data
- Extract number of working days as scalars from the data
- Use scalar multiplication to scale production values by number of working days
- Code a linear combination of vectors using addition and scalar multiplication


## Task 1 - Load and study the data

In [1]:
# Load "numpy" and "pandas" for manipulating numbers, vectors and data frames

import numpy as np
import pandas as pd

In [8]:
# Read in the "Factory_Production.csv" file as a Pandas Data Frame
# Note: Make sure the code and the data are in the same folder or specify the appropriate path for the data
df = pd.read_csv('/content/Factory_Production.csv')
df.set_index('Factory', inplace=True)

In [9]:
# Take a brief look at the data frame using ".head()"

df.head()

Factory,Generators,Motors,Cables,Days
A,8,17,232,352
B,6,20,203,348
C,10,16,187,337
D,7,24,218,357
E,13,13,256,362
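
If the CSV file is not at hand, the five rows printed above are enough to follow along. The sketch below rebuilds them from an inline string. The names `sample_csv` and `df_sample` are illustrative and not part of the original notebook, and factories F to J are left out because their production values are not shown here. It also uses the `index_col` argument of `pd.read_csv` as one possible alternative to calling `set_index` afterwards.

```python
import io
import pandas as pd

# Assumption: only the five rows shown by df.head() are reproduced here
sample_csv = """Factory,Generators,Motors,Cables,Days
A,8,17,232,352
B,6,20,203,348
C,10,16,187,337
D,7,24,218,357
E,13,13,256,362
"""

# index_col='Factory' makes the factory label the row index while reading
df_sample = pd.read_csv(io.StringIO(sample_csv), index_col='Factory')
print(df_sample.shape)
```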


##### Feature Description:
This data set contains the production values of power units such as generators, motors and cables of different factories belonging to the same corporation.
- Factory = unique label assigned to a factory unit for the purpose of identification
- Generators = number of generators produced by the factory in a day
- Motors = number of motors produced by the factory in a day
- Cables = number of cables produced by the factory in a day
- Days = number of working days of the factory in a year

##### Study the data's features, such as:
- The number of factory units
- The number of machine parts
- The ranges of production values

In [11]:
# Check the dimensions of the data frame using ".shape"

df.shape


(10, 4)

In [12]:
# View basic information about the data frame using ".info()"

df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 10 entries, A to J
Data columns (total 4 columns):
 #   Column      Non-Null Count  Dtype
---  ------      --------------  -----
 0   Generators  10 non-null     int64
 1   Motors      10 non-null     int64
 2   Cables      10 non-null     int64
 3   Days        10 non-null     int64
dtypes: int64(4)
memory usage: 700.0+ bytes


In [13]:
# View a statistical summary of the data frame using ".describe()"
# Note: Use ".transpose()" to make the summary easier to read

df.describe().transpose()

,count,mean,std,min,25%,50%,75%,max
Generators,10.0,8.8,2.65832,5.0,7.0,8.5,10.75,13.0
Motors,10.0,16.4,4.926121,8.0,13.5,16.5,19.5,24.0
Cables,10.0,205.1,29.01513,156.0,187.0,206.5,221.75,256.0
Days,10.0,344.6,15.889899,316.0,337.5,347.0,355.75,365.0


#### Observations

- There are 10 rows and 4 columns
- The first 3 columns contain the production values of generators, motors and cables
- The fourth column contains the number of working days of the factories

## Task 2 - Extract production values as vectors from the data


- The factory production values for each factory are contained in the "Generators", "Motors" and "Cables" features.
- Separating these features from the "Days" feature could be useful.

In [14]:
# Drop the "Days" column from the data frame using ".drop()" and save the data in a new data frame called "df_prod"
df_prod = df.drop('Days', axis=1)


In [15]:
# Take a brief look at the data frame "df_prod" using ".head()"
df_prod.head()


Factory,Generators,Motors,Cables
A,8,17,232
B,6,20,203
C,10,16,187
D,7,24,218
E,13,13,256


In [16]:
# Access the vector of production values for factory "D" using ".loc[]" from the data frame "df_prod"

df_prod.loc['D']

,D
Generators,7
Motors,24
Cables,218


#### Observations

- The vector of production values for any factory has 3 components
- The components are the numbers of generators, motors and cables produced per day
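
To make the "vector" view concrete, here is a minimal sketch that turns one row into a plain NumPy array. It assumes only the five sample rows shown earlier (factories A to E), and `df_prod_sample` and `vec_d` are illustrative names, not part of the original notebook.

```python
import pandas as pd

# Assumption: daily production for factories A to E, copied from the table above
df_prod_sample = pd.DataFrame(
    {
        "Generators": [8, 6, 10, 7, 13],
        "Motors": [17, 20, 16, 24, 13],
        "Cables": [232, 203, 187, 218, 256],
    },
    index=pd.Index(list("ABCDE"), name="Factory"),
)

# .loc returns a labelled pandas Series; .to_numpy() drops the labels
vec_d = df_prod_sample.loc["D"].to_numpy()
print(vec_d, vec_d.shape)
```

Keeping the Series form preserves the part names, while the NumPy form is convenient when only the numbers matter.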

## Task 3 - Extract number of working days as scalars from the data



- The number of working days for each factory is stored in the "Days" column of the data frame "df".
- These will be the scalars that we will use to multiply with the production value vectors.

In [17]:
# Access the "Days" column from the data frame "df"

df_days = df["Days"]
df_days


Factory,Days
A,352
B,348
C,337
D,357
E,362
F,365
G,339
H,324
I,316
J,346


In [19]:
# Access the number of working days of factory "G" from the Pandas Series obtained by accessing the "Days" column
# Note: You may use ".loc[]" to access these values as well

fac_gdays = df_days.loc['G']
print(fac_gdays)

339


#### Observations

- The "Days" column contains the number of working days in a year of each factory.
- These will be the scalars that we will use to multiply with the production value vectors.

## Task 4 - Use scalar multiplication to scale production values by number of working days


- The production values for each factory are available in the data frame "df_prod".
- The number of working days in a year for each factory can be extracted from the data frame "df".
- The product of the vector of production values and the number of working days is a scaled version of the original vector.
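
Written out for factory B, whose daily production values (6, 20, 203) and working days (348) appear in the tables above, scalar multiplication scales every component by the same number:

$$
348 \begin{bmatrix} 6 \\ 20 \\ 203 \end{bmatrix}
= \begin{bmatrix} 348 \cdot 6 \\ 348 \cdot 20 \\ 348 \cdot 203 \end{bmatrix}
= \begin{bmatrix} 2088 \\ 6960 \\ 70644 \end{bmatrix}
$$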

![](https://miro.medium.com/max/1400/1*gf3HdrkDBi6Dch_XxVuYvA.png)

## **`Watch Video 5 : Matrix Multiplication`**

In [21]:
# Access the vector of production values for factory "B" using ".loc[]" from the data frame "df_prod"

df_prod.loc['B']

,B
Generators,6
Motors,20
Cables,203


In [23]:
# Access the number of working days of factory "B"

print(df_days.loc['B'])


348


In [24]:
# Multiply the vector of production values for factory "B" by its number of working days

print(df_prod.loc['B']*df_days.loc['B'])

Generators     2088
Motors         6960
Cables        70644
Name: B, dtype: int64


#### Observations

- The product of the vector of production values and the number of working days is a scaled version of the original vector.
- The scaled vector of production values for a certain factory shows the number of machine parts produced in a full year.
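
The same scaling can be applied to every factory at once. The sketch below is one possible way to do it with `DataFrame.mul`, assuming only the five sample rows shown earlier (factories A to E). The names `df_sample`, `prod`, `days` and `yearly` are illustrative and not part of the original notebook.

```python
import pandas as pd

# Assumption: sample rows for factories A to E, copied from the tables above
df_sample = pd.DataFrame(
    {
        "Generators": [8, 6, 10, 7, 13],
        "Motors": [17, 20, 16, 24, 13],
        "Cables": [232, 203, 187, 218, 256],
        "Days": [352, 348, 337, 357, 362],
    },
    index=pd.Index(list("ABCDE"), name="Factory"),
)

prod = df_sample.drop("Days", axis=1)  # daily production vectors, one per row
days = df_sample["Days"]               # one scalar per factory

# axis=0 matches the scalars to the row labels,
# so each row is multiplied by its own factory's working days
yearly = prod.mul(days, axis=0)
print(yearly)
```

Each row of `yearly` is the scaled vector for one factory, so the row for B matches the result printed above.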

## Task 5 - Code a linear combination of vectors using addition and scalar multiplication

- The vector of daily production values for a certain factory can be accessed from the data frame "df_prod".
- The number of working days in a year for each factory can be extracted from the data frame "df".
- The total production of a certain factory in a full year can be obtained using scalar multiplication.
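
In symbols, using notation introduced here only for illustration: let $\mathbf{p}_i$ be the daily production vector of factory $i$ and $d_i$ its number of working days. The yearly production of the whole corporation is then the linear combination

$$
\mathbf{t} = \sum_{i \in \{A, \dots, J\}} d_i \, \mathbf{p}_i
= d_A \mathbf{p}_A + d_B \mathbf{p}_B + \dots + d_J \mathbf{p}_J
$$

Each term is a scalar multiplication, and the sum is ordinary component-wise vector addition.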

![](https://www2.seas.gwu.edu/~simhaweb/lin/modules/module3/figures/matrixvec2.png)

In [33]:
# Calculate the total production values of all the factories using their production value vectors and number of working days
# Add these vectors to obtain a single vector that contains the yearly production values of the whole corporation
# Note: Create a NumPy array called "total" with a size of 3 and fill it with zeros
# Note: You may use the "total" array to cumulatively add the yearly production values for each factory

total = np.zeros(3)  # array of 3 zeros: one accumulator each for generators, motors and cables

for factory in df.index:
    total += df_prod.loc[factory] * df_days.loc[factory]

In [35]:
# Print the "total" series

total

,A
Generators,30355.0
Motors,56602.0
Cables,708048.0


#### Observations

- The corporation produces 30355 generators in a year.
- The corporation produces 56602 motors in a year.
- The corporation produces 708048 cables in a year.
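
The loop above can also be written as a single vector-matrix product. The sketch below compares the two forms, assuming only the five sample rows shown earlier (factories A to E), so its totals are smaller than the ten-factory figures reported above. The names `df_sample`, `prod`, `days`, `total_loop` and `total_dot` are illustrative and not part of the original notebook.

```python
import numpy as np
import pandas as pd

# Assumption: sample rows for factories A to E, copied from the tables above
df_sample = pd.DataFrame(
    {
        "Generators": [8, 6, 10, 7, 13],
        "Motors": [17, 20, 16, 24, 13],
        "Cables": [232, 203, 187, 218, 256],
        "Days": [352, 348, 337, 357, 362],
    },
    index=pd.Index(list("ABCDE"), name="Factory"),
)

prod = df_sample[["Generators", "Motors", "Cables"]].to_numpy()  # shape (5, 3)
days = df_sample["Days"].to_numpy()                              # shape (5,)

# Loop form: accumulate one scaled vector per factory
total_loop = np.zeros(3)
for d, p in zip(days, prod):
    total_loop += d * p

# Matrix form: the same linear combination as one product, days @ prod
total_dot = days @ prod

print(total_loop)
print(total_dot)
```

Both forms compute the same linear combination. The product form avoids the explicit loop, which tends to matter more as the number of factories grows.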

#### Conclusions

- From the factory production data, we can calculate the total production of the corporation
- We can treat the production values of each factory as vectors
- We can use the number of days as scalars to scale each factory's production values.
- We can calculate the total production as a linear combination of the factories' production vectors, with the numbers of working days as the scalar coefficients.
