# Plotting stress-strain curves in a reproducible way
This tutorial shows how to plot publication-quality stress-strain curves using Python.
The datasets are Excel files, which we assume to be the standard format in which stress-strain data is stored.
The tutorial also covers some basics of the Python programming language.
Moreover, this notebook can be modified to handle a large number of data files of the same format, plot multiple curves at once, and keep the plotting style reproducible.

# Why use Python for plotting?

- Flexible: any special plot feature can be created
- Free and open source
- Fast calculations, with many other options for data analysis
- Reproducible
- Automates repetitive steps


*With a **JupyterHub server**, Python is easier to use than ever.*

## Modules in Python
Python has a huge library of code.
Related pieces of code are bundled together into modules.
When we want to use such a set of code for a certain purpose, we **import** it.
Once you have imported a module, you can use the functions, classes, etc. that it contains.

For ease of access, we also rename some modules to shorter names of our choice.
The following modules are used in this notebook:
1. [pandas](https://pandas.pydata.org/)            - For data analysis and manipulation
2. [numpy](https://numpy.org/)                     - Package for numerical/scientific computing with lots of array-based calculations
3. [math](https://docs.python.org/3/library/math.html) - Module for mathematical functions
4. [matplotlib](https://matplotlib.org/)           - The most widely used plotting library for Python
5. [os](https://docs.python.org/3/library/os.html) - Library for operating-system-related functionality

In [None]:
import pandas as pd
import numpy as np
import math
import os
import matplotlib as mpl
import matplotlib.pyplot as PyPlot
%matplotlib inline

### Reading in an Excel file
From my Excel file, I know that the relevant data starts at row 5 (in Python this is row 4, as counting starts from 0).

I use only columns A to L.

In the call to pandas' `read_excel`, we first give the name of the Excel file.

Then we specify which row of the Excel file is used to label our data.

As row 4 (start counting at 0!) holds the column names, we set the header to 4.
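The steps described above could be sketched as follows. The helper name `read_stress_strain` and the path `sample_data/Data_1.xlsx` are assumptions made for illustration; `header=4` and `usecols="A:L"` follow the description in the text.

```python
import os
import pandas as pd

def read_stress_strain(path, header_row=4, columns="A:L"):
    """Read one stress-strain Excel file into a DataFrame.

    header_row is counted from 0, so 4 means the fifth row of the sheet.
    columns is an Excel-style column range.
    """
    return pd.read_excel(path, header=header_row, usecols=columns)

# Assumed location of one data file; adjust to your own folder layout
example_path = 'sample_data/Data_1.xlsx'
if os.path.exists(example_path):
    df = read_stress_strain(example_path)
```

Wrapping the call in a small function keeps the header row and column range in one place, so every file is read with the same settings.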


This is how the file now looks in the form of a **DataFrame**:

In [None]:
df

### Accessing data within a DataFrame
The data in a column of a DataFrame can be accessed by giving the column name in square brackets, as shown below.

In [None]:
df['True Strain']

### Plotting

In [None]:
PyPlot.plot(df['True Strain'],df['True Stress'])

## Looping over files
But as we want to use the power that Python provides, we would like to process all files at once.
***Loops*** give us this power: looping means repeating a set of commands a number of times.
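As a minimal sketch of what a loop does, the example below repeats the same command for every item in a list. The temperature values are only illustrative.

```python
temperatures = [1073, 1173, 1273, 1373]  # illustrative values in K

labels = []
for temperature in temperatures:
    # the indented body runs once for every item in the list
    labels.append(f"{temperature} K")

print(labels)
```

The indentation marks which commands belong to the loop; the `print` call is not indented, so it runs only once, after the loop has finished.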

We also need to know about two data types in Python:
**lists** and **dictionaries**.


### List example
This is an example of a list. A list is written with square brackets **[ ]**.

It stores a sequence of items under a single variable name, and the items can be looped over (a list is *iterable*).

In [None]:
# empty list
example_list = []

# list of integers
example_list = [1, 2, 3]

In [None]:
# Call list items
example_list[0]

In the example below, `os.listdir` returns a list of file names, which we can loop over.

In [None]:
os.listdir('sample_data/')
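`os.listdir` returns names in arbitrary order and lists everything in the folder, including sub-folders such as notebook checkpoints. A small helper along the following lines (the name `list_excel_files` is hypothetical) keeps only the Excel files and sorts them, so that the order stays the same on every run.

```python
import os

def list_excel_files(folder):
    """Return the names of the .xlsx files in folder, sorted alphabetically."""
    names = []
    for name in os.listdir(folder):
        full_path = os.path.join(folder, name)
        is_excel = name.lower().endswith('.xlsx')
        is_lock_file = name.startswith('~$')  # temporary file Excel creates while a workbook is open
        if os.path.isfile(full_path) and is_excel and not is_lock_file:
            names.append(name)
    return sorted(names)

# Only runs if the folder used in this notebook exists
if os.path.isdir('sample_data'):
    print(list_excel_files('sample_data'))
```

Note that alphabetical sorting places `Data_10.xlsx` before `Data_2.xlsx`; with more than nine files, zero-padded names such as `Data_02.xlsx` avoid this.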

### Dictionary example
A Python dictionary is similar to a 'normal' dictionary: you look up a key to find the value stored under it. Dictionaries are written with curly brackets **{ }**.

Dictionaries can store any kind of data or data structure as values.

Each entry is a key-value pair.

In [None]:
example_dict = {'a':1, 'b': 2}
example_dict = {'a':'Apple', 'b': 'Banana'}
example_dict = {'a':[1,2,3], 'b': [4,5,6]}

In [None]:
# what does one specific key correspond to?
example_dict['a']

So we will create a dictionary whose values are DataFrames like the one we saw before, with the file names as keys.

We create an empty dictionary and then add entries to it.
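Filling a dictionary inside a loop can be sketched with a toy example first; the names used here are made up for illustration.

```python
squares = {}                       # start with an empty dictionary
for number in [1, 2, 3]:
    squares[number] = number ** 2  # key in the square brackets, value on the right

print(squares)  # {1: 1, 2: 4, 3: 9}

# keys and values can be looped over together with .items()
for key, value in squares.items():
    print(key, value)
```

The same pattern works when the keys are file names and the values are DataFrames.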

In [None]:
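data_dict = {}
# sorted() fixes the file order, so each curve gets the same style and label on every run
for i in sorted(os.listdir('sample_data/')):
    # key: file name, value: DataFrame read with header row 4 and columns A to L as before
    data_dict[i] = pd.read_excel(os.path.join('sample_data', i), header=4, usecols='A:L')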

In [None]:
data_dict.keys()

In [None]:
data_dict['Data_1.xlsx']

#### Finding stress and strain limits in the whole dataset

In [None]:
max_stress_lim = 0.0
max_strain_lim = 0.0
for expt_data in data_dict.values():
    max_y_lim = expt_data['True Stress'].max()  # maximum stress in this file
    max_x_lim = expt_data['True Strain'].max()  # maximum strain in this file
    if max_y_lim > max_stress_lim:
        max_stress_lim = max_y_lim
    if max_x_lim > max_strain_lim:
        max_strain_lim = max_x_lim

In [None]:
max_strain_lim

In [None]:
max_stress_lim

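Rounding the limits to convenient axis values: the strain limit to the nearest 0.5, and the stress limit to the nearest multiple of 50 MPa plus two further steps of 50 MPa as headroom.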

In [None]:
max_strain_lim = round(max_strain_lim * 2.0) / 2.0
max_stress_lim = (round(max_stress_lim/50.0)+2.0)*50.0

In [None]:
max_stress_lim,max_strain_lim
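To see what the rounding arithmetic does, here is a small worked sketch with made-up input values. `nearest_limits` repeats the formulas from the rounding cell; `ceil_limits` is an alternative that is not part of the original notebook and is shown only as an assumption, for cases where rounding to the nearest value would cut off data.

```python
import math

def nearest_limits(max_strain, max_stress):
    """Same arithmetic as the rounding cell: nearest 0.5, and nearest 50 plus two steps."""
    strain_lim = round(max_strain * 2.0) / 2.0
    stress_lim = (round(max_stress / 50.0) + 2.0) * 50.0
    return strain_lim, stress_lim

def ceil_limits(max_strain, max_stress):
    """Hypothetical variant: always round upwards to the next step."""
    strain_lim = math.ceil(max_strain * 2.0) / 2.0
    stress_lim = math.ceil(max_stress / 50.0) * 50.0
    return strain_lim, stress_lim

print(nearest_limits(0.93, 412.0))  # (1.0, 500.0)
print(nearest_limits(0.70, 412.0))  # (0.5, 500.0): the strain axis would end before the data does
print(ceil_limits(0.70, 412.0))     # (1.0, 450.0)
```

Rounding to the nearest step can land below the largest measured strain, as the second call shows, so it is worth checking the rounded limits against the raw maxima.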

## Plot settings

Create a simple plot by looping

In [None]:
fig, ax = PyPlot.subplots()

for datas in data_dict.values():
    PyPlot.plot(datas['True Strain'],
                datas['True Stress']
                )

This is a very simple plot, not yet publication ready.

Generally we change the following things:
1. Colors
2. Linestyles
3. Linewidths
4. Labels
5. Axes labels
6. Axes limits
7. Axes ticks (optional)

In [None]:
set_colors = ['black','red','teal','fuchsia','blue'] # https://matplotlib.org/gallery/color/named_colors.html

In [None]:
linestyle_list = ['solid', 'dashed', 'dashdot', 'dotted']
label_list = [1073,1173,1273,1373]
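The plotting loop pairs these lists up with `zip`, which walks through several lists in step. A minimal sketch using the same values (the variable names here are illustrative):

```python
colors = ['black', 'red', 'teal', 'fuchsia', 'blue']
linestyles = ['solid', 'dashed', 'dashdot', 'dotted']
temperatures = [1073, 1173, 1273, 1373]

# zip yields one tuple per position and stops at the end of the shortest list
pairs = list(zip(colors, linestyles, temperatures))
for color, linestyle, temperature in pairs:
    print(color, linestyle, temperature)

print(len(pairs))  # 4: 'blue' is never used because the other lists have four items
```

Because `zip` stops silently at the shortest list, a fifth data file would not be plotted unless the line style and label lists are extended as well.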

In [None]:
fig, ax = PyPlot.subplots()
ax.tick_params(which='both',direction="in",top=True,right=True)
ax.minorticks_on()

# one curve per file, each with its own line style, colour and label
for datas,line,colors,labels in zip(data_dict.values(),linestyle_list,set_colors,label_list):
    PyPlot.plot(datas['True Strain'],
                datas['True Stress'],
                color=colors,
                linewidth = 2.5,
                linestyle = line,
                label = labels
                )

# axis limits, labels and legend only need to be set once, after the loop
PyPlot.xlim(-0.025,max_strain_lim)
PyPlot.ylim(0.0,max_stress_lim)
PyPlot.xlabel(r'$\epsilon$(-)',fontsize=18)   # latex based symbols
PyPlot.ylabel(r'$\sigma$ (MPa)',fontsize=18)
PyPlot.legend(title=r'$ T (\mathrm{K})$')

fig.savefig('multi_stress_strain.png',dpi=600)

### Cutting off data

Cutting off the data can be done automatically, but the easiest way is to do it manually.

We look at the plot to decide where to cut off and give that strain value.

In [None]:
cut_off_strains = []
data_dict_2 = {}
data_dict_2['Data_1.xlsx'] = data_dict['Data_1.xlsx'].loc[data_dict['Data_1.xlsx']['True Strain'] < 0.88]

In [None]:
PyPlot.plot(data_dict_2['Data_1.xlsx']['True Strain'],data_dict_2['Data_1.xlsx']['True Stress'])
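The manual cut-off relies on a boolean mask: a comparison on a column gives one True/False per row, and `.loc` keeps only the rows marked True. A sketch on a small made-up DataFrame:

```python
import pandas as pd

# made-up numbers, for illustration only
toy = pd.DataFrame({'True Strain': [0.0, 0.3, 0.6, 0.9, 1.2],
                    'True Stress': [0.0, 200.0, 260.0, 280.0, 150.0]})

mask = toy['True Strain'] < 0.88   # one True/False per row
trimmed = toy.loc[mask]            # keep only the rows where mask is True

print(mask.tolist())   # [True, True, True, False, False]
print(len(trimmed))    # 3
```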

## Automating cut off

Idea: find where the derivative of stress with respect to strain drops significantly.

For automation, the data may need to be smoothed first.

Note: a simpler idea would be better for Python newcomers; the code should not become complicated.
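The idea can be tried out on a synthetic curve before applying it to measured data. The curve below is made up (hardening followed by a sharp drop centred at a strain of 0.85), and the window length and polynomial order are illustrative choices, not recommendations.

```python
import math
import numpy as np
from scipy.signal import savgol_filter

# Made-up curve: hardening, then a sharp stress drop centred at a strain of 0.85
strain = np.linspace(0.0, 1.0, 201)
stress = (300.0 * (1.0 - np.exp(-10.0 * strain))
          - 250.0 / (1.0 + np.exp(-(strain - 0.85) / 0.01)))

# Smoothed first derivative d(stress)/d(strain)
slope = savgol_filter(stress, window_length=31, polyorder=3,
                      deriv=1, delta=strain[1] - strain[0])

index_min = int(np.argmin(slope))   # position of the steepest drop
cut_off_value = strain[index_min]
rounded_cut_off = math.floor(cut_off_value * 50.0) / 50.0  # round down to a multiple of 0.02

trimmed_strain = strain[strain < rounded_cut_off]
trimmed_stress = stress[strain < rounded_cut_off]
print(cut_off_value, rounded_cut_off)
```

The minimum of the derivative marks the steepest point of the drop rather than its onset, so on real data it may be necessary to subtract a small margin from the cut-off value.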

In [None]:
from scipy.signal import savgol_filter

In [None]:
fig,ax1 = PyPlot.subplots()
ax1.plot(data_dict['Data_4.xlsx']['True Strain'],savgol_filter(data_dict['Data_4.xlsx']['True Stress'],31,9,deriv=1))
ax2 = ax1.twinx()
ax2.plot(data_dict['Data_4.xlsx']['True Strain'],data_dict['Data_4.xlsx']['True Stress'])

In [None]:
index_min = np.argmin(savgol_filter(data_dict['Data_1.xlsx']['True Stress'],71,9,deriv=1))
data_dict['Data_1.xlsx']['True Strain'].iloc[index_min]  # argmin returns a position, so use .iloc

In [None]:
math.floor(0.8953057035808059*50.0)/50.0

In [None]:
data_dict['Data_4.xlsx'].loc[data_dict['Data_4.xlsx']['True Strain'] < 0.88]

In [None]:
new_data_dict = {}
for name,datas in data_dict.items():
    # position of the most negative smoothed slope (steepest stress drop)
    index_min = np.argmin(savgol_filter(datas['True Stress'],31,9,deriv=1))
    cut_off_value = datas['True Strain'].iloc[index_min]  # positional lookup
    rounded_cut_off = math.floor(cut_off_value*50.0)/50.0  # round down to a multiple of 0.02
    new_data_dict[name] = datas.loc[datas['True Strain'] < rounded_cut_off]

In [None]:
new_data_dict['Data_1.xlsx']

In [None]:
fig, ax = PyPlot.subplots()
ax.tick_params(which='both',direction="in",top=True,right=True)
ax.minorticks_on()

# one curve per file, each with its own line style, colour and label
for datas,line,colors,labels in zip(new_data_dict.values(),linestyle_list,set_colors,label_list):
    PyPlot.plot(datas['True Strain'],
                datas['True Stress'],
                color=colors,
                linewidth = 2.5,
                linestyle = line,
                label = labels
                )

# axis limits, labels and legend only need to be set once, after the loop
PyPlot.xlim(-0.025,max_strain_lim)
PyPlot.ylim(0.0,max_stress_lim)
PyPlot.xlabel(r'$\epsilon$(-)',fontsize=18)   # latex based symbols
PyPlot.ylabel(r'$\sigma$ (MPa)',fontsize=18)
PyPlot.legend(title=r'$ T (\mathrm{K})$')

fig.savefig('multi_stress_strain_cut_off.png',dpi=600)
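Since the styled plot is drawn twice with almost identical code, the shared settings could be collected in one function so that every figure uses the same style. This is a sketch only; the function name `plot_curves` and its parameters are assumptions, not part of the original notebook.

```python
import matplotlib.pyplot as PyPlot
import pandas as pd

def plot_curves(curves, linestyles, colors, labels, x_max, y_max):
    """Plot every DataFrame in curves with one shared style and return (fig, ax)."""
    fig, ax = PyPlot.subplots()
    ax.tick_params(which='both', direction='in', top=True, right=True)
    ax.minorticks_on()
    for data, linestyle, color, label in zip(curves.values(), linestyles, colors, labels):
        ax.plot(data['True Strain'], data['True Stress'],
                color=color, linewidth=2.5, linestyle=linestyle, label=label)
    ax.set_xlim(-0.025, x_max)
    ax.set_ylim(0.0, y_max)
    ax.set_xlabel(r'$\epsilon$ (-)', fontsize=18)
    ax.set_ylabel(r'$\sigma$ (MPa)', fontsize=18)
    ax.legend(title=r'$T$ (K)')
    return fig, ax
```

With the names used in this notebook, a call might look like `fig, ax = plot_curves(new_data_dict, linestyle_list, set_colors, label_list, max_strain_lim, max_stress_lim)`, followed by `fig.savefig(...)` with the desired file name.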