# Week-1 Assignment

*Welcome to your first assignment for the SimuTech Winter Project 2022! I hope you are excited to implement and test everything you have learned up until now. There is an interesting set of questions for you to refine your acquired skills as you delve into hands-on coding and deepen your understanding of numpy, pandas, and data visualization libraries.*

# Section 0 : Importing Libraries

*Let's begin by importing numpy, pandas and matplotlib.*

In [None]:
#your code here
import numpy as np 
import pandas as pd 
from matplotlib import pyplot as plt


# Section 1 : Playing with Python and Numpy

### Q1. Matrix Multiplication

##### (i) Check if matrix multiplication is valid

In [None]:
def isValid(A,B):
  #your code here
    # A @ B is defined only when the column count of A equals the row count of B
    r1, c1 = np.shape(A)
    r2, c2 = np.shape(B)
    return c1 == r2

##### (ii) Using loops (without using numpy)

In [None]:
def matrix_multiply(A,B):
  #your code here
    # Size the result from the inputs: (rows of A) x (columns of B)
    r = [[0 for _ in range(len(B[0]))] for _ in range(len(A))]
    for i in range(len(A)):
        for j in range(len(B[0])):
            for k in range(len(B)):
                r[i][j] += A[i][k]*B[k][j]
    return r

##### (iii) Using numpy

In [None]:
def matrix_multiply_2(A,B):
  #your code here
    c= np.dot(A,B)
    return c
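
As an aside, NumPy's `@` operator gives the same result as `np.dot` for 2-D arrays. Below is a minimal sketch using the test matrices from part (iv); the names `via_dot` and `via_at` are illustrative only.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])  # shape (4, 3)
B = np.array([[13, 14, 15], [16, 17, 18], [19, 20, 21]])        # shape (3, 3)

via_dot = np.dot(A, B)  # the call used in matrix_multiply_2
via_at = A @ B          # operator form of the same 2-D matrix product

print(via_at.shape)                     # (4, 3)
print(np.array_equal(via_dot, via_at))  # True
```

The result has as many rows as `A` and as many columns as `B`, which is why the shape check in part (i) only compares the inner dimensions.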

##### (iv) Testing your code

Run the following cell to check if your functions are working properly.

*Expected output:*
[ [102 108 114]
 [246 261 276]
 [390 414 438]
 [534 567 600] ]

In [None]:
A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
    [10, 11, 12]
])

B = np.array([
    [13, 14, 15],
    [16, 17, 18],
    [19, 20, 21]
])

if isValid(A,B):
  print(f"Result using loops: {matrix_multiply(A,B)}")
  print(f"Result using numpy: {matrix_multiply_2(A,B)}")
else:
  print(f"Matrix multiplication is not valid")

### Q2. Z-Score Normalisation

Z-score normalization refers to the process of normalizing every value in a dataset such that the mean of all of the values is 0 and the standard deviation is 1.

We use the following formula to perform a z-score normalization on every value in a dataset:

New value = (x – μ) / σ

where:

x: Original value

μ: Mean of data

σ: Standard deviation of data
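
To make the formula concrete, here is a small worked sketch on made-up numbers (the array `data` is illustrative and not part of the assignment). It uses the population standard deviation, i.e. it divides by the number of values, which is also NumPy's default.

```python
import numpy as np

# Illustrative values chosen so that the mean is 5 and the standard deviation is 2
data = np.array([2, 4, 4, 4, 5, 5, 7, 9], dtype=float)

mu = data.mean()         # 5.0
sigma = data.std()       # population standard deviation (ddof=0): 2.0
z = (data - mu) / sigma  # apply (x - mu) / sigma to every value

print(z.tolist())        # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]

# After normalisation the mean is (numerically) 0 and the standard deviation is 1
print(np.isclose(z.mean(), 0.0), np.isclose(z.std(), 1.0))  # True True
```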

##### (i) Without using numpy

In [None]:
def mean(x):
  #your code here
    m = sum(x)/len(x)
    return m

In [None]:
def standard_deviation(x):
  #your code here
    m1= mean(x)
    var = sum((v1-m1)**2 for v1 in x)/len(x)
    std = var**0.5
    return std

In [None]:
def zscore_normalisation(x):
  #your code here
    m2= mean(x)
    s2= standard_deviation(x)
    z1= [(val-m2)/s2 for val in x]
    return z1

##### (ii) Using numpy

NumPy has built-in functions for calculating the mean and standard deviation.

In [None]:
def zscore_normalisation_2(x):
  #your code here
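    x = np.asarray(x, dtype=float)
    m3= np.mean(x)
    s3= np.std(x)    # population standard deviation (ddof=0), same as in part (i)
    z2= (x - m3)/s3  # vectorised: operates on the whole array and returns an array
    return z2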

##### (iii) Testing your code

Run the following cell to check if your functions are working properly.

*Expected Output:* [-1.06753267 -0.99745394 -0.99745394 -0.81057732 -0.41346451 -0.06307086
  0.31068237  0.91803138  1.22170588  1.89913361]

In [None]:
x = [4, 7, 7, 15, 32, 47, 63, 89, 102, 131]
print(f"Result without using numpy: {zscore_normalisation(x)}")
print(f"Result using numpy: {zscore_normalisation_2(x)}")

### Q3. Sigmoid fn and its derivative

The sigmoid function is a mathematical function that maps any input value to a value between 0 and 1.

It is defined mathematically as s(x) = 1/(1+e^(-x)).
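
Differentiating this definition gives the standard identity s'(x) = s(x)(1 - s(x)), which is always positive. The sketch below checks the identity numerically against a central finite difference; the function name `sigmoid`, the step size `h` and the sample points are assumptions made for illustration.

```python
import numpy as np

def sigmoid(x):
    """Element-wise s(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + np.exp(-np.asarray(x, dtype=float)))

x = np.array([-2.0, 0.0, 2.0])  # illustrative sample points
h = 1e-6                        # small step for the finite difference

analytic = sigmoid(x) * (1.0 - sigmoid(x))             # s(x) * (1 - s(x))
numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)  # central difference

print(np.allclose(analytic, numeric))  # True
print(analytic[1])                     # 0.25, the maximum slope, reached at x = 0
```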

##### (i) Write a fn to implement sigmoid fn

In [None]:
def sigmoidfn(x):
  #your code here
    # np.exp works element-wise, so this handles arrays of any shape and returns an array
    v1 = 1/(1+np.exp(-np.asarray(x)))
    return v1

##### (ii) Write a fn to implement derivative of sigmoid fn

In [None]:
def derivative(x):
  #your code here
    v = np.asarray(sigmoidfn(x))
    ds = v*(1-v)  # s'(x) = s(x) * (1 - s(x))
    return ds

##### (iii) Test your code

Run the following cell to check if your functions are working properly.

*Expected output:*

x on applying sigmoid activation fn is: [ [0.99987661 0.88079708 0.99330715 0.5        0.5       ]
 [0.99908895 0.99330715 0.5        0.5        0.5       ] ]

x on applying derivative of sigmoid activation fn is: [ [1.23379350e-04 1.04993585e-01 6.64805667e-03 2.50000000e-01
  2.50000000e-01]
 [9.10221180e-04 6.64805667e-03 2.50000000e-01 2.50000000e-01
  2.50000000e-01] ]

In [None]:
x = np.array([
    [9,2,5,0,0],
    [7,5,0,0,0]
])
print(f"x on applying sigmoid activation fn is: {sigmoidfn(x)}")
print(f"x on applying derivative of sigmoid activation fn is: {derivative(x)}")

# Section 2: Exploring Pandas

*You have been provided with a dataset which includes information about properties of superheated vapor.*

*The dataset consists of the thermophysical properties: specific volume, specific internal energy, specific enthalpy, specific entropy of superheated vapor.*

*Pressure is in kPa and temperature is in degrees Celsius. In the dataframe, the column labels 75, 100, 125, etc. are temperatures.*

### Read the csv file


In [None]:
#your code here
import pandas as pd
df = pd.read_csv('superheated_vapor_properties_solution.csv')

### Display the shape of data frame


In [None]:
#your code here
s = df.shape
print(s)

### Return an array containing names of all the columns

In [None]:
#your code here
column = list(df.columns)
print(column)

### Display the number of null values in each column of the dataframe



In [None]:
#your code here
null_count = df.isna().sum()
print(null_count)


### Create a column which contains the Pressure and Property columns, separated by 'at' (e.g. V at 1, H at 101.325). Using this, print the following:
- Enthalpy at 75 kPa and 573 K
- Entropy at 493 K and 250 kPa



In [None]:
#your code here
df["Property at Pressure"] = df['Property'] + " at " + df['Pressure'].astype(str)
# Temperature columns are in degrees Celsius: 573 K -> '300', 493 K -> '220'
s = df[df['Property at Pressure']=='H at 75.0']
print("enthalpy at 75kPa and 573K is", s['300'].iloc[0])  # .iloc[0] prints the value, not the whole Series
t = df[df['Property at Pressure']=='S at 250.0']
print("entropy at 250kPa and 493K is", t['220'].iloc[0])
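
The same lookup can also be done without building a combined string column. The following is only a sketch on a hypothetical miniature frame: the numbers in `toy` are placeholders rather than values from the dataset, and the helper `lookup` is an illustrative name. It assumes, as the task above does, that 573 K corresponds to the column labelled `300`.

```python
import pandas as pd

# Hypothetical miniature of the dataset; the numeric values are placeholders
toy = pd.DataFrame({
    "Pressure": [75.0, 250.0],
    "Property": ["H", "S"],
    "220": [1.0, 2.0],
    "300": [3.0, 4.0],
})

def lookup(frame, prop, pressure_kpa, temp_kelvin):
    """Return the value of `prop` at the given pressure (kPa) and temperature (K)."""
    column = str(round(temp_kelvin - 273))  # 573 K -> "300", matching the column labels
    mask = (frame["Property"] == prop) & (frame["Pressure"] == pressure_kpa)
    return frame.loc[mask, column].iloc[0]

print(lookup(toy, "H", 75.0, 573))   # 3.0
print(lookup(toy, "S", 250.0, 493))  # 2.0
```

Comparing the numeric `Pressure` column directly avoids depending on how the pressure is formatted as text (`75.0` versus `75`).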

### Find out the column with the highest number of missing values

In [None]:
#your code here
print("column with most missing values is", df.count().idxmin())
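
`df.count()` counts the non-null entries per column, so the column with the smallest count is the one with the most missing values. A minimal sketch on a made-up frame (`toy` and its values are illustrative) shows that this agrees with counting the nulls directly:

```python
import numpy as np
import pandas as pd

# Made-up frame: column "b" has two missing values, "a" has one, "c" has none
toy = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [np.nan, np.nan, 1.0],
    "c": [1.0, 2.0, 3.0],
})

missing = toy.isna().sum()   # nulls per column
print(missing.idxmax())      # b  (most nulls)
print(toy.count().idxmin())  # b  (fewest non-null entries)
```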

### What is the average enthalpy of Sat. Liq. at all different pressures in the dataset?

In [None]:
#your code here
# Keep only the enthalpy (H) rows, then average the saturated-liquid column over all pressures
a = df[df['Property']=='H']['Liq_Sat'].mean()
print('Average enthalpy of saturated liquid over all pressures is', a)

### Separate out the V,U,H,S data from the dataset into V_data, U_data, H_data, S_data

In [None]:
#your code here
V_data = df[df['Property']=='V']
U_data = df[df['Property']=='U']
H_data = df[df['Property']=='H']
S_data = df[df['Property']=='S']
print("V_data")
print(V_data)
print("U_data")
print(U_data)
print("H_data")
print(H_data)
print("S_data")
print(S_data)

# All four tables go into one file; mode='a' appends instead of overwriting
out_path = "c:/Users/Mews system/OneDrive/Desktop/CHE Project/Machine-Learning-with-Python/230372_DhyanaviChauhan/Assignment1/test.csv"
V_data.to_csv(out_path)
U_data.to_csv(out_path, mode='a')
H_data.to_csv(out_path, mode='a')
S_data.to_csv(out_path, mode='a')




# Section 3: Plotting with Matplotlib

### Plot the properties (specific volume, specific internal energy, specific enthalpy, specific entropy) vs Pressure for saturated liquid.

Note:
- Try using the subplot feature of matplotlib(Explore it!!)
- Provide appropriate title, labels, markersize and other parameters to the plot
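
One possible way to use the subplot feature is sketched below. The helper `plot_saturated` and the frame `toy` are hypothetical: `toy` holds placeholder numbers and only mirrors the column names `Pressure`, `Property` and `Liq_Sat` used in this notebook.

```python
import pandas as pd
from matplotlib import pyplot as plt

def plot_saturated(frame, column, phase_label):
    """Draw one panel per property (V, U, H, S) against pressure on a 2x2 grid."""
    names = {"V": "specific volume", "U": "specific internal energy",
             "H": "specific enthalpy", "S": "specific entropy"}
    fig, axes = plt.subplots(2, 2, figsize=(10, 8))
    for ax, (prop, name) in zip(axes.flat, names.items()):
        rows = frame[frame["Property"] == prop]  # rows for this property only
        ax.plot(rows["Pressure"], rows[column], marker="o", markersize=3)
        ax.set_title(f"{name} vs pressure ({phase_label})")
        ax.set_xlabel("pressure (kPa)")
        ax.set_ylabel(name)
    fig.tight_layout()  # keep titles and axis labels from overlapping
    return fig, axes

# Placeholder data, not taken from the dataset
toy = pd.DataFrame({
    "Pressure": [1.0, 1.0, 1.0, 1.0, 10.0, 10.0, 10.0, 10.0],
    "Property": ["V", "U", "H", "S"] * 2,
    "Liq_Sat": [1.0, 2.0, 3.0, 4.0, 1.5, 2.5, 3.5, 4.5],
})
fig, axes = plot_saturated(toy, "Liq_Sat", "sat. liq.")
```

Calling `plt.show()` afterwards displays the figure. With the real dataframe, passing `"Vap_Sat"` as the column would draw the saturated-vapor version on the same layout.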

In [None]:
#your code here
import pandas as pd
from matplotlib import pyplot as plt
df = pd.read_csv('superheated_vapor_properties_solution.csv')

s= df[df['Property']=='S']
v= df[df['Property']=='V']
h= df[df['Property']=='H']
u= df[df['Property']=='U']

s.plot(x="Pressure", y="Liq_Sat")
plt.title("entropy vs pressure for sat. liq")
plt.ylabel("entropy")
plt.xlabel("pressure")

v.plot(x="Pressure", y="Liq_Sat")
plt.title("volume vs pressure for sat. liq")
plt.xlabel("pressure")
plt.ylabel("volume")

h.plot(x="Pressure", y= "Liq_Sat")
plt.title("enthalpy vs pressure for sat. liq")
plt.xlabel("Pressure")
plt.ylabel("enthalpy")


u.plot(x="Pressure", y ="Liq_Sat")
plt.title("internal energy vs pressure for sat. liq")
plt.xlabel("pressure")
plt.ylabel("internal energy")
plt.show()

### Plot the same for saturated vapor.

In [None]:
#your code here

import pandas as pd
from matplotlib import pyplot as plt
df = pd.read_csv('superheated_vapor_properties_solution.csv')

s= df[df['Property']=='S']
v= df[df['Property']=='V']
h= df[df['Property']=='H']
u= df[df['Property']=='U']

s.plot(x="Pressure", y="Vap_Sat")
plt.title("entropy vs pressure for sat. vap")
plt.ylabel("entropy")
plt.xlabel("pressure")

v.plot(x="Pressure", y="Vap_Sat")
plt.title("volume vs pressure for sat. vap")
plt.xlabel("pressure")
plt.ylabel("volume")

h.plot(x="Pressure", y= "Vap_Sat")
plt.title("enthalpy vs pressure for sat. vap")
plt.xlabel("Pressure")
plt.ylabel("enthalpy")


u.plot(x="Pressure", y ="Vap_Sat")
plt.title("internal energy vs pressure for sat. vap")
plt.xlabel("pressure")
plt.ylabel("internal energy")
plt.show()

### Plot the specific volume of saturated liquid between 300 kPa and 1500 kPa

In [None]:
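#your code here
v= df[df['Property']=='V']
# Filter the rows (rather than only setting xlim) so the y-axis is scaled to this pressure range
v_range = v[(v['Pressure']>=300) & (v['Pressure']<=1500)]
v_range.plot(x="Pressure", y="Liq_Sat")
plt.title("volume vs pressure for sat. liq b/w 300 kPa and 1500 kPa")
plt.xlabel('pressure')
plt.ylabel('volume')
plt.show()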

# Section 4 : Conclusion

*Congratulations on reaching this point! I hope you had fun solving your first assignment and have also built confidence in applying these libraries. If you are wondering, we will cover more about z-score normalization in Week 2, and the sigmoid function will be used in Week 3. After completing this assignment, you are now prepared to learn about machine learning techniques and implement your own machine learning models.*