# Week-1 Assignment

*Welcome to your first assignment for the SimuTech Winter Project 2022! I hope you are excited to implement and test everything you have learned up until now. There is an interesting set of questions for you to refine your acquired skills as you delve into hands-on coding and deepen your understanding of numpy, pandas, and data visualization libraries.*

# Section 0 : Importing Libraries

*Let's begin by importing numpy, pandas and matplotlib.*

In [1]:
#your code here
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt


# Section 1 : Playing with Python and Numpy

### Q1. Matrix Multiplication

##### (i) Check if matrix multiplication is valid

In [None]:
def isValid(A,B):
  #your code here
  #A (m x n) and B (p x q) can be multiplied only when n == p
  return np.shape(A)[1] == np.shape(B)[0]

##### (ii) Using loops (without using numpy)

In [None]:
def matrix_multiply(A,B):
  #your code here
  #Dimensions are read with len() so that no numpy call is needed
  r1, c1 = len(A), len(A[0])
  c2 = len(B[0])

  #Creating a list with all values set to 0
  result = [[0 for _ in range(c2)] for _ in range(r1)]

  #Using loops
  for i in range(r1):
    for j in range(c2):
      for k in range(c1):
        result[i][j] += A[i][k] * B[k][j]

  return result

##### (iii) Using numpy

In [None]:
def matrix_multiply_2(A,B):
  #your code here
  return A @ B

##### (iv) Testing your code

Run the following cell to check if your functions are working properly.

*Expected output:*
[ [102 108 114]
 [246 261 276]
 [390 414 438]
 [534 567 600] ]

In [None]:
A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
    [10, 11, 12]
])

B = np.array([
    [13, 14, 15],
    [16, 17, 18],
    [19, 20, 21]
])


if isValid(A,B):
  print(f"Result using loops: {matrix_multiply(A,B)}")
  print(f"Result using numpy: {matrix_multiply_2(A,B)}")
else:
  print("Matrix multiplication is not valid")

### Q2. Z-Score Normalisation

Z-score normalization refers to the process of normalizing every value in a dataset such that the mean of all of the values is 0 and the standard deviation is 1.

We use the following formula to perform a z-score normalization on every value in a dataset:

New value = (x – μ) / σ

where:

x: Original value

μ: Mean of data

σ: Standard deviation of data
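
As a quick illustration of the formula, here is a minimal sketch on a three-value dataset using only Python's standard `statistics` module. The sample values are made up for illustration, and the population standard deviation (dividing by N) is assumed. After normalisation the values should have mean 0 and standard deviation 1.

```python
from statistics import mean, pstdev

data = [2, 4, 6]         # hypothetical sample values
mu = mean(data)          # 4
sigma = pstdev(data)     # population standard deviation, sqrt(8/3)

# Apply (x - mu) / sigma to every value
z = [(value - mu) / sigma for value in data]

print(z)                 # roughly [-1.22, 0.0, 1.22]
print(mean(z), pstdev(z))
```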

##### (i) Without using numpy

In [2]:
def mean(x):
  #your code here
  total=0 #named 'total' to avoid shadowing the built-in sum()
  for i in x:
    total=total+i
  return total/len(x)

In [3]:
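def standard_deviation(x): 
  #your code here
  m=mean(x) #compute the mean once instead of once per element
  total=0
  for i in x:
    total=total+((i-m)**2)

  #Population variance (divide by N), matching the default of np.std
  variance = total / len(x)

  SD=variance**0.5

  return SD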

In [None]:
def zscore_normalisation(x):
  #your code here
  #Compute the mean and standard deviation once, outside the comprehension
  m = mean(x)
  sd = standard_deviation(x)
  normalized_data = [(i - m) / sd for i in x]
  return normalized_data
        
    

##### (ii) Using numpy

NumPy has built-in functions for calculating the mean and standard deviation.

In [None]:
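def zscore_normalisation_2(x):
    #your code here
    x = np.asarray(x, dtype=float)
    mean1=np.mean(x)
    std1=np.std(x) #population standard deviation (ddof=0) by default

    #Vectorised: the arithmetic is applied to every element of the array at once
    normalized_data2 = (x - mean1) / std1

    return normalized_data2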
  

##### (iii) Testing your code

Run the following cell to check if your functions are working properly.

*Expected Output:* [-1.06753267 -0.99745394 -0.99745394 -0.81057732 -0.41346451 -0.06307086
  0.31068237  0.91803138  1.22170588  1.89913361]

In [None]:
x = [4, 7, 7, 15, 32, 47, 63, 89, 102, 131]
print(f"Result without using numpy: {zscore_normalisation(x)}")
print(f"Result using numpy: {zscore_normalisation_2(x)}")

### Q3. Sigmoid function and its derivative

##### (i) Write a function to implement the sigmoid function

In [4]:
def sigmoidfn(x):
  #your code here
  #np.exp works element-wise on arrays and avoids a hard-coded value of e
  return 1/(1+np.exp(-x))

The sigmoid function is a mathematical function that maps any input value to a value between 0 and 1.

It is defined mathematically as s(x) = 1/(1+e^(-x)).
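
Differentiating this definition gives s'(x) = e^(-x) / (1 + e^(-x))^2, which simplifies to s(x)(1 - s(x)) and is therefore positive for every x. The sketch below checks that identity numerically against a central finite difference. The helper name `sigmoid`, the sample points, and the step size `h` are illustrative choices, not part of the assignment.

```python
import numpy as np

def sigmoid(x):
    """Element-wise sigmoid, 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([-2.0, 0.0, 2.0, 5.0])   # arbitrary sample points

# Closed form of the derivative: s(x) * (1 - s(x))
s = sigmoid(x)
analytic = s * (1.0 - s)

# Central finite difference as an independent check
h = 1e-5
numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)

print(np.allclose(analytic, numeric, atol=1e-8))
print(analytic[1])   # derivative at x = 0 is 0.5 * 0.5 = 0.25
```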

##### (ii) Write a function to implement the derivative of the sigmoid function

In [5]:
def derivative(x):
  #your code here
  #s'(x) = e^(-x) / (1 + e^(-x))^2, which equals s(x) * (1 - s(x))
  num=np.exp(-x)
  den=(1+np.exp(-x))**2
  return num/den


##### (iii) Test your code

Run the following cell to check if your functions are working properly.

*Expected output:*

x on applying sigmoid activation fn is: [ [0.99987661 0.88079708 0.99330715 0.5        0.5       ]
 [0.99908895 0.99330715 0.5        0.5        0.5       ] ]

x on applying derivative of sigmoid activation fn is: [ [1.23379350e-04 1.04993585e-01 6.64805667e-03 2.50000000e-01
  2.50000000e-01]
 [9.10221180e-04 6.64805667e-03 2.50000000e-01 2.50000000e-01
  2.50000000e-01] ]

In [6]:
x = np.array([
    [9,2,5,0,0],
    [7,5,0,0,0]
])
print(f"x on applying sigmoid activation fn is: {sigmoidfn(x)}")
print(f"x on applying derivative of sigmoid activation fn is: {derivative(x)}")

x on applying sigmoid activation fn is: [[0.99987661 0.88079708 0.99330715 0.5        0.5       ]
 [0.99908895 0.99330715 0.5        0.5        0.5       ]]
x on applying derivative of sigmoid activation fn is: [[1.23379350e-04 1.04993585e-01 6.64805667e-03 2.50000000e-01
  2.50000000e-01]
 [9.10221180e-04 6.64805667e-03 2.50000000e-01 2.50000000e-01
  2.50000000e-01]]


# Section 2: Exploring Pandas

*You have been provided with a dataset which includes information about properties of superheated vapor.*

*The dataset consists of the thermophysical properties: specific volume, specific internal energy, specific enthalpy, specific entropy of superheated vapor.*

*Pressure is in kPa and temperature is in degrees Celsius. The column labels 75, 100, 125, etc. in the dataframe are temperatures.*

### Read the csv file


In [None]:
#your code here
#pd.read_csv already returns a DataFrame, so no extra conversion is needed
df=pd.read_csv("superheated_vapor_properties.csv")
df.head()


### Display the shape of data frame


In [None]:
#your code here
Shape=df.shape
print("Shape of Dataframe is:",Shape)




### Return an array containing names of all the columns

In [None]:
#your code here
Col_name=df.columns
array=Col_name.to_numpy()
print("Given is the array of column names:\n",array)
type(array)

### Display the number of null values in each column of the dataframe



In [None]:
#your code here
df.isnull().sum(axis=0)
 

### Create a column which contains the Pressure and Property columns, separated with 'at' (e.g. V at 1, H at 101.325). Using this, print the following:
- Enthalpy at 75 kPa and 573 K
- Entropy at 493 K and 250 kPa



In [None]:
#your code here

#Pressure is numeric, so both columns are converted to 'str' before concatenating
df['Combined'] = df['Property'].astype(str) + ' at ' + df['Pressure'].astype(str)

#Column labels are temperatures in degrees Celsius: 573 K -> "300", 493 K -> "220"

row1=df.loc[(df["Combined"]=="H at 75.0")].index #H is specific enthalpy
col1=df.columns.get_loc("300")
print("Enthalpy at 75 kPa and 573 K is: ",df.iloc[row1,col1].values)

row2=df.loc[(df["Combined"]=="S at 250.0")].index #S is specific entropy
col2=df.columns.get_loc("220")
print("Entropy at 493 K and 250 kPa is: ",df.iloc[row2,col2].values)

#A NaN result means the dataset has no value at 220 degrees Celsius for this pressure
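
The lookup pattern above can be sketched on a small stand-in table. Everything in this example is hypothetical: the `toy` frame only mimics the layout described for the CSV (one row per pressure and property, one string-labelled column per temperature in degrees Celsius), its numbers are placeholders rather than real steam-table values, and `lookup` is an illustrative helper. The Kelvin-to-Celsius offset of 273 follows the assignment's own convention (573 K corresponds to the "300" column).

```python
import pandas as pd

# Hypothetical toy data; the values are placeholders, not real properties.
toy = pd.DataFrame({
    "Pressure": [75.0, 75.0, 250.0, 250.0],
    "Property": ["H", "S", "H", "S"],
    "220": [1.0, 2.0, 3.0, float("nan")],
    "300": [5.0, 6.0, 7.0, 8.0],
})

# Same construction as in the assignment: "<Property> at <Pressure>"
toy["Combined"] = toy["Property"] + " at " + toy["Pressure"].astype(str)

def lookup(frame, prop, pressure_kpa, temp_k):
    """Return the value of `prop` at a pressure in kPa and a temperature in K."""
    column = str(round(temp_k - 273))            # 573 K -> "300"
    key = f"{prop} at {float(pressure_kpa)}"     # e.g. "H at 75.0"
    return frame.loc[frame["Combined"] == key, column].iloc[0]

print(lookup(toy, "H", 75, 573))    # placeholder value 5.0
print(lookup(toy, "S", 250, 493))   # nan, because the toy entry is missing
```

Selecting by label with `.loc` keeps the row filter and the column name in one expression, which avoids the separate index and `get_loc` steps.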


### Find out the column with the highest number of missing values

In [None]:
#your code here
null_counts = df.isnull().sum()
print("Column with the highest number of missing values:", null_counts.idxmax())
#Full ranking, from most to fewest missing values
null_counts.sort_values(ascending=False)


### What is the average enthalpy of Sat. Liq. at all different pressures in the dataset?

In [None]:
#your code here
grouped = df.groupby(df.Property)
H_data = grouped.get_group("H")
print("The average enthalpy of Sat. Liq. at all different pressures is: ")
H_data["Liq_Sat"].mean()




### Separate out the V,U,H,S data from the dataset into V_data, U_data, H_data, S_data


In [None]:
#your code here
grouped = df.groupby("Property") #group once and reuse it for all four properties

V_data = grouped.get_group("V")
print ("V_data is: \n",V_data)

U_data = grouped.get_group("U")
print ("U_data is: \n",U_data)

H_data = grouped.get_group("H")
print ("H_data is: \n",H_data)

S_data = grouped.get_group("S")
print ("S_data is: \n",S_data)

# Section 3: Plotting with Matplotlib

### Plot the properties (specific volume, specific internal energy, specific enthalpy, specific entropy) vs Pressure for saturated liquid.

Note:
- Try using the subplot feature of matplotlib (explore it!)
- Provide appropriate title, labels, markersize and other parameters to the plot
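
One possible way to follow these notes is to loop over a 2 x 2 grid of axes so that every panel gets its own title, labels and marker settings. This is only a sketch: the `pressure` and `series` values are made-up placeholders rather than data from the CSV, and the figure size and marker choices are arbitrary.

```python
import matplotlib.pyplot as plt

# Hypothetical placeholder data, one series per property
pressure = [1, 10, 100, 1000]
series = {
    "V": [1.0, 1.1, 1.2, 1.3],
    "U": [10, 20, 30, 40],
    "H": [11, 22, 33, 44],
    "S": [0.1, 0.2, 0.3, 0.4],
}

fig, axes = plt.subplots(2, 2, figsize=(8, 6))

# axes.flat walks the 2 x 2 grid row by row
for ax, (name, values) in zip(axes.flat, series.items()):
    ax.plot(pressure, values, marker="o", markersize=4)
    ax.set_title(f"{name} vs Pressure")
    ax.set_xlabel("Pressure (kPa)")
    ax.set_ylabel(name)

fig.suptitle("Placeholder properties vs Pressure")
fig.tight_layout()   # keeps titles and labels from overlapping
```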

In [None]:
#your code here

fig, axes = plt.subplots(2, 2)
fig.suptitle("V,U,H,S v/s P graph for Sat_Liq ")
fig.supxlabel('Pressure')
fig.supylabel('Properties')
axes[0, 0].plot(V_data["Pressure"],V_data["Liq_Sat"],color='r',marker='*') 
axes[0, 0].set_title("V v/s P")

axes[0, 1].plot(U_data["Pressure"],U_data["Liq_Sat"], marker='>' ) 
axes[0, 1].set_title("U v/s P")

axes[1, 0].plot(H_data["Pressure"],H_data["Liq_Sat"],color='y')
axes[1, 0].set_title("H v/s P")

axes[1, 1].plot(S_data["Pressure"],S_data["Liq_Sat"], color='g')
axes[1, 1].set_title("S v/s P")

#Pressure, Liq_Sat and Vap_Sat have no null values, so they are plotted directly
fig.tight_layout() #keeps subplot titles and axis labels from overlapping
plt.show()



### Plot the same for saturated vapor.

In [None]:
#your code here


fig, axes = plt.subplots(2, 2)

fig.suptitle("V,U,H,S v/s P graph for Sat_Vap ")
fig.supxlabel('Pressure')
fig.supylabel('Properties')

axes[0, 0].plot(V_data["Pressure"],V_data["Vap_Sat"],color='r',marker='*') 
axes[0, 0].set_title("V v/s P")

axes[0, 1].plot(U_data["Pressure"],U_data["Vap_Sat"], marker='>' ) 
axes[0, 1].set_title("U v/s P")

axes[1, 0].plot(H_data["Pressure"],H_data["Vap_Sat"],color='y')
axes[1, 0].set_title("H v/s P")

axes[1, 1].plot(S_data["Pressure"],S_data["Vap_Sat"], color='g')
axes[1, 1].set_title("S v/s P")

fig.tight_layout() #keeps subplot titles and axis labels from overlapping
plt.show()

### Plot the specific volume of saturated liquid between 300 kPa and 1500 kPa

In [None]:
#your code here

vplot=V_data.loc[(V_data['Pressure'] > 300) & (V_data['Pressure'] < 1500)]
plt.plot(vplot["Pressure"],vplot["Liq_Sat"],color='m',marker='+')
plt.xlabel("Pressure (in kPa)",color='r')
plt.ylabel("Specific Volume",color='r')
plt.title("Specific Volume v/s Pressure graph")


# Section 4 : Conclusion

*Congratulations on reaching this point! I hope you had fun solving your first assignment and have also built confidence in applying these libraries. If you are wondering, we will cover more about z-score normalization in Week 2, and the sigmoid function will be used in Week 3. After completing this assignment, you are now prepared to learn about machine learning techniques and implement your own machine learning models.*