# 2019 DA Project:  Supply Chain - Perishable Goods

## Project Plan

- Choose a real-world phenomenon
- Research and understand the phenomenon
- Identify variables 
- Match variables to a distribution
- Synthesise the dataset
- Analyse variables and their interrelationships
- Devise an algorithm or method to synthesise those variables
- Generate dataset

## Introduction

About one third of the food produced in the world for human consumption every year (approximately 1.3 billion tonnes) is lost or wasted. Fruit and vegetables have the highest wastage rates of any food.
http://www.fao.org/save-food/resources/keyfindings/en/

As world food demand increases, reducing waste at each stage of the food chain is essential. Many factors in manufacturing operations can cause waste; according to Lemma et al., most of them are inefficiencies in production, storage and transportation. In addition, inappropriate planning and supply chain management practices are the main operational reasons for waste in different countries [14].

Food waste is senseless and immoral given the hundreds of millions of people who do not have enough food to eat. There are several contributors to the losses worldwide: unsuitable harvest timing, weather conditions, handling practices, retail losses and household waste.

The value of perishable products changes significantly over time in the supply chain, at rates that are often highly dependent on temperature and humidity. The size of this change depends on the time interval between production and delivery to the customer [12]. With tight margins, complex sourcing and international dimensions, minimising waste is a challenge.
This project focuses on losses that occur from storage, through transportation, to the retailer.

## Research

### Apples

Once harvested, apples have to overcome many obstacles on their journey to the customer, and may then not even be eaten, as happens to nearly one third of food produced worldwide.
https://www.climateforesight.eu/water-food/uneaten-apple-climate-change/

#### Harvesting apples
Apples are harvested shortly before ripening has begun. If the apples are harvested too early, however, this will have consequences for quality; the fruits are then small, hard and green with little red bloom and are thus susceptible to ripening deficiencies such as scald and speck. If the apples are harvested too late, they may become mealy, soft and very sensitive and the options for storage are then limited.

#### Storage conditions
Apples are mostly stored in cool stores under controlled atmosphere. They can be stored for up to 12 months with CA/ULO storage and remain in good condition at a temperature of between 0 and 5 ºC, depending on the variety. In order to keep moisture loss to a minimum, apples should be stored in 90-95% atmospheric humidity. Apples generally respond very well to a reduction of O2 and an increase of CO2 (Controlled atmosphere storage). Generally, chilled apples produce little ethylene as long as they do not begin to ripen. They are, however, very susceptible to ethylene, which will set the ripening process in motion. There may also be apples among the crop that are already ripening and are, therefore, producing ethylene; these can activate the ripening of their 'neighbours'. By storing them in very low oxygen conditions (ULO storage), ethylene production is also kept very low. The sensitivity of apples to ethylene is significantly lower under ULO conditions.

Optimum storage conditions for apples:

- Temperature: 0 - 5 degrees Celsius
- Humidity: 90 - 95%
- Shelf life: 10 - 14 days

## Variables

Four variables have been identified:

#### Variable 1: Temperature
Apples should be stored at between 0 and 5 degrees Celsius. A normal distribution is appropriate.

#### Variable 2: Humidity
Relative humidity should be 90 - 95%. A normal distribution is appropriate.

#### Variable 3:  Shelf Life
Shelf life is 10 - 14 days. A normal distribution is appropriate.
 
#### Variable 4:  Transportation time (in days)
This relates to the transportation time from the manufacturer to the distribution centres.  The time in days has been set at 1 - 2 days.  A normal distribution is therefore appropriate.
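
Each variable above is given a target range and a normal distribution. A normal distribution has no hard limits, so the standard deviation decides how many draws land outside the range. The sketch below is one way to check that share before generating data. The helper name `share_in_range` is hypothetical, and the standard deviations are example values chosen for illustration.

```python
import math


def share_in_range(mean, sd, low, high):
    """Probability that a Normal(mean, sd) draw falls in [low, high]."""
    def cdf(x):
        # Normal CDF written with the error function
        return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))
    return cdf(high) - cdf(low)


# Temperature range 0 to 5 degrees Celsius, centred on the midpoint 2.5
wide = share_in_range(mean=2.5, sd=2.5, low=0, high=5)     # range = mean +/- 1 sd
narrow = share_in_range(mean=2.5, sd=1.25, low=0, high=5)  # range = mean +/- 2 sd

print(f"sd=2.5:  {wide:.1%} of draws in range")
print(f"sd=1.25: {narrow:.1%} of draws in range")
```

When the range spans one standard deviation either side of the mean, roughly 68% of unrounded draws fall inside it; at two standard deviations the share rises to roughly 95%.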

## Import Libraries

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import csv
import string
from random import choices

In [2]:
# Render matplotlib plots inline in the notebook.
%matplotlib inline
# Apply the default seaborn settings
sns.set()

### Establish size of dataset

In [3]:
# Dataset comprises 1000 apples and is called apple
apple = 1000

In [4]:
# Give the products random batch numbers
# Adapted from https://www.geeksforgeeks.org/python-string-ascii_uppercase/ & https://codereview.stackexchange.com/questions/198182/generate-10-random-3-letter-strings
batch = ["".join(choices(string.ascii_uppercase, k=7)) for _ in range(apple)]
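
Batch numbers built this way differ on every run and are not guaranteed to be unique. The sketch below shows one possible way to make them repeatable and to count duplicates. It uses a seeded `random.Random` instance; the seed value is an arbitrary assumption and not part of the original notebook.

```python
import random
import string

apple = 1000               # dataset size, as in the notebook
rng = random.Random(2019)  # seed value is an arbitrary assumption

# Seven random uppercase letters per batch number
batch = ["".join(rng.choices(string.ascii_uppercase, k=7)) for _ in range(apple)]

# Collisions are unlikely with 26**7 possible codes, but cheap to check
n_duplicates = len(batch) - len(set(batch))
print(batch[:3], "duplicates:", n_duplicates)
```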

## Match Variables to Distributions

### Variables

In [5]:
# Generate data for temperature, humidity & shelf life using np.random.normal and call it data
# Target ranges: temperature 0 to 5 degrees C, humidity 90% to 95% & shelf life 10 to 14 days
data = {
    "temperature": np.around(np.random.normal(2.5, 2.5, apple), 0),
    "humidity": np.around(np.random.normal(92.5, 2.5, apple), 0),
    "shelf_life": np.around(np.random.normal(12, 2, apple), 0),
}
# Create a pandas dataframe from the data above and call it df
df = pd.DataFrame(data=data)
# output df
df

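| | temperature | humidity | shelf_life |
|---|---|---|---|
| 0 | 4.0 | 93.0 | 15.0 |
| 1 | 2.0 | 92.0 | 13.0 |
| 2 | 6.0 | 93.0 | 13.0 |
| 3 | 3.0 | 91.0 | 8.0 |
| 4 | -1.0 | 92.0 | 8.0 |
| ... | ... | ... | ... |
| 995 | 1.0 | 94.0 | 14.0 |
| 996 | 1.0 | 89.0 | 11.0 |
| 997 | 1.0 | 86.0 | 11.0 |
| 998 | 4.0 | 91.0 | 12.0 |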
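
The values above change on every run because `np.random.normal` is called without a seed. A minimal sketch of a repeatable version follows, using NumPy's `default_rng` generator with the same means and standard deviations as the cell above. The seed value and the name `df_seeded` are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

apple = 1000                          # dataset size, as in the notebook
rng = np.random.default_rng(seed=42)  # seed value is an arbitrary assumption

# Same means and standard deviations as the cell above, rounded to whole numbers
df_seeded = pd.DataFrame({
    "temperature": np.around(rng.normal(2.5, 2.5, apple), 0),
    "humidity": np.around(rng.normal(92.5, 2.5, apple), 0),
    "shelf_life": np.around(rng.normal(12, 2, apple), 0),
})

print(df_seeded.head())
```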


In [6]:
# Create a reusable function that takes temperature, humidity and shelf life
def get_label(temperature, humidity, shelf_life):
    # Return 1 (waste) if temperature is out of range
    if temperature < 0 or temperature > 5:
        return 1
    # Return 1 (waste) if humidity is out of range
    elif humidity < 90 or humidity > 95:
        return 1
    # Return 1 (waste) if shelf life is out of range
    elif shelf_life < 10 or shelf_life > 14:
        return 1
    # Otherwise return 0 (not waste)
    return 0

# Add a variable and call it waste
df['waste'] = df.apply(lambda row: get_label(row['temperature'], 
                                             row['humidity'],
                                             row['shelf_life']), axis=1)
df

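| | temperature | humidity | shelf_life | waste |
|---|---|---|---|---|
| 0 | 4.0 | 93.0 | 15.0 | 1 |
| 1 | 2.0 | 92.0 | 13.0 | 0 |
| 2 | 6.0 | 93.0 | 13.0 | 1 |
| 3 | 3.0 | 91.0 | 8.0 | 1 |
| 4 | -1.0 | 92.0 | 8.0 | 1 |
| ... | ... | ... | ... | ... |
| 995 | 1.0 | 94.0 | 14.0 | 0 |
| 996 | 1.0 | 89.0 | 11.0 | 1 |
| 997 | 1.0 | 86.0 | 11.0 | 1 |
| 998 | 4.0 | 91.0 | 12.0 | 0 |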
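
The row-wise `df.apply` call above can also be written with column operations. The sketch below is an alternative, not the notebook's method. It uses `Series.between`, which is inclusive at both ends and so matches the `<` and `>` comparisons in `get_label`. The frame `df_demo` is a stand-in holding the first five rows of the output above.

```python
import pandas as pd

# Small illustrative frame; values are the first five rows of the output above
df_demo = pd.DataFrame({
    "temperature": [4.0, 2.0, 6.0, 3.0, -1.0],
    "humidity": [93.0, 92.0, 93.0, 91.0, 92.0],
    "shelf_life": [15.0, 13.0, 13.0, 8.0, 8.0],
})

# between() is inclusive at both ends, so boundary values count as in range
in_range = (
    df_demo["temperature"].between(0, 5)
    & df_demo["humidity"].between(90, 95)
    & df_demo["shelf_life"].between(10, 14)
)

# waste is 1 when any variable is out of range, 0 otherwise
df_demo["waste"] = (~in_range).astype(int)
print(df_demo)
```

The two approaches differ for missing values: `between` treats a missing value as out of range, whereas `get_label` would return 0. The generated data contains no missing values, so the labels agree here.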


The dataset is complete. Time to analyse it.

In [7]:
# Adapted from https://stackoverflow.com/questions/49609353/pandas-dataframe-to-csv-not-exporting-all-rows/53606044
df.to_csv("sc.csv", index=False, sep=',', mode='w')

In [8]:
sc = pd.read_csv('https://raw.githubusercontent.com/mhurley100/DA-Project-2019/master/sc.csv', sep=',')

In [9]:
sc.describe()

| | Ship_Days | Mfg_Days | Lead_time | Mthly_Forecast | Safety_Stock | Actual Sales |
|---|---|---|---|---|---|---|
| count | 200.0 | 200.0 | 200.0 | 200.0 | 200.0 | 200.0 |
| mean | 5.025 | 6.044 | 11.069 | 230.05 | 154.1335 | 256.125 |
| std | 0.23377 | 0.998976 | 1.052463 | 381.912007 | 255.881045 | 413.968549 |
| min | 4.0 | 2.0 | 7.0 | 10.0 | 6.7 | 15.0 |
| 25% | 5.0 | 5.4 | 10.4 | 10.0 | 6.7 | 15.0 |
| 50% | 5.0 | 6.1 | 11.15 | 10.0 | 6.7 | 15.0 |
| 75% | 5.0 | 6.625 | 11.7 | 100.0 | 67.0 | 120.0 |
| max | 6.0 | 8.6 | 13.6 | 1000.0 | 670.0 | 1090.0 |
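
The summary above lists columns such as `Ship_Days` and `Mthly_Forecast` with a count of 200, not the `temperature`, `humidity`, `shelf_life` and `waste` columns written in In [7]. This suggests the remote `sc.csv` is not the file generated in this notebook. One possible way to analyse the generated data is to read back the local file. The sketch below uses a temporary directory and a three-row stand-in frame (`df_demo`, illustrative values only) so that it runs on its own.

```python
import os
import tempfile

import pandas as pd

# Stand-in for the generated dataframe (illustrative values only)
df_demo = pd.DataFrame({
    "temperature": [4.0, 2.0, 6.0],
    "humidity": [93.0, 92.0, 93.0],
    "shelf_life": [15.0, 13.0, 13.0],
    "waste": [1, 0, 1],
})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sc.csv")
    df_demo.to_csv(path, index=False)  # write without the index column
    sc_local = pd.read_csv(path)       # read the same file back

# The columns read back should match the ones written
print(list(sc_local.columns))
print(sc_local.describe())
```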


In [10]:
# sns.lmplot(x="temperature", y="humidity", data = sc)
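
Before plotting, a numerical first look at the generated variables can show how much each one contributes to waste. The sketch below is one way to do this under stated assumptions: it regenerates a stand-in frame with a seeded generator (the seed is arbitrary) and takes the acceptable ranges from the Variables section. The names `df_demo`, `limits` and `out_of_range` are hypothetical.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=0)  # seed value is an arbitrary assumption
n = 1000

df_demo = pd.DataFrame({
    "temperature": np.around(rng.normal(2.5, 2.5, n), 0),
    "humidity": np.around(rng.normal(92.5, 2.5, n), 0),
    "shelf_life": np.around(rng.normal(12, 2, n), 0),
})

# Acceptable ranges taken from the Variables section
limits = {"temperature": (0, 5), "humidity": (90, 95), "shelf_life": (10, 14)}

# Boolean column per variable: True where the value is outside its range
out_of_range = pd.DataFrame({
    name: ~df_demo[name].between(low, high) for name, (low, high) in limits.items()
})

# A row counts as waste when any of its variables is out of range
df_demo["waste"] = out_of_range.any(axis=1).astype(int)

print(out_of_range.mean().round(3))  # share out of range, per variable
print("waste share:", df_demo["waste"].mean())
```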

## References
- [1] Python Software Foundation. Welcome to python.org. https://www.python.org/
- [2] GMIT. Quality assurance framework. https://www.gmit.ie/general/quality-assurance-framework
- [3] Software Freedom Conservancy. Git. https://git-scm.com/
- [4] Project Jupyter. Project jupyter. https://jupyter.org/
- [5] NumPy developers. Numpy. http://www.numpy.org/
- [6] Clear Spider. https://www.clearspider.com/blog-reduce-inventory-shortages/
- [7] University of New Brunswick, Fredericton, NB, Canada. http://www2.unb.ca/~ddu/4690/Lecture_notes/Lec2.pdf
- [8] Buildmedia. https://buildmedia.readthedocs.org/media/pdf/supplychainpy/latest/supplychainpy.pdf
- [9] Pynative. https://pynative.com/python-random-choice/
- [10] Scipy. https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.random.normal.html
- [11] Wikipedia. https://en.wikipedia.org/wiki/ABC_analysis
- [12] Researchgate. https://www.researchgate.net/publication/227520884_Supply_Chain_Strategies_for_Perishable_Products_The_Case_of_Fresh_Produce
- [13] EC.europa. http://ec.europa.eu/environment/life/project/Projects/index.cfm?fuseaction=search.dspPage&n_proj_id=5007&docType=pdf
- [14] link.springer. https://link.springer.com/article/10.1007/s40092-018-0287-1