# **Optimizing the Flow of Materials through a Production Process using Data Analysis and Python Implementation**

## Import the necessary libraries

In [1]:
import pandas as pd

## Load the production data

In [2]:
# Load the data into a Pandas DataFrame
df = pd.read_csv('production_data.txt')

# inspect production data
df.head()

| Unnamed: 0 | machine_id | machine_type | production_time | product_type | production_volume | production_date | production_shift | machine_maintenance | raw_material_usage | quality_control |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | extruding machine A | 120 | textile grade | 500 kg | 2021-01-01 | day shift | 2021-01-01 | chemicals A and B | yes |
| 1 | 2 | extruding machine B | 100 | textile grade | 450 kg | 2021-01-02 | day shift | 2021-01-02 | chemicals A and C | yes |
| 2 | 3 | extruding machine C | 110 | industrial grade | 400 kg | 2021-01-03 | day shift | 2021-01-03 | chemicals B and C | yes |
| 3 | 4 | extruding machine D | 90 | textile grade | 350 kg | 2021-01-04 | day shift | 2021-01-04 | chemicals A and B | no |
| 4 | 5 | printing machine A | 80 | textile grade | 300 kg | 2021-01-05 | day shift | 2021-01-05 | chemicals A and C | yes |


The production data contains the following features:

* `machine_id`: The unique identifier for each machine.
* `machine_type`: The type of machine (e.g. "cutting machine", "extruding machine").
* `production_time`: The time it took to produce the nylon on this machine.
* `product_type`: The type of nylon being produced (e.g. "textile grade", "industrial grade").
* `production_volume`: The amount of nylon produced during this production run, stored as text with its unit (e.g. "500 kg").
* `production_date`: The date on which this production run took place.
* `production_shift`: The shift during which this production run took place (e.g. "day shift", "night shift").
* `machine_maintenance`: Information about when each machine was last serviced. In the rows shown above it holds a date.
* `raw_material_usage`: Information about the raw materials used in this production run. In the rows shown above it names the chemicals used (e.g. "chemicals A and B").
* `quality_control`: Information about quality control during production, such as testing for defects or measuring the dimensions of the finished products. In the rows shown above it is recorded as "yes" or "no".
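
Several of these columns are stored as text, so they may need converting before any numeric analysis. The following is a minimal sketch of one way to do that, using two rows copied from the `df.head()` output. The column names `production_volume_kg` and `quality_control_flag` are hypothetical, and the code assumes every volume is a number followed by "kg".

```python
import io

import pandas as pd

# Two sample rows copied from the df.head() output (a subset of the columns).
sample_csv = io.StringIO(
    "machine_id,machine_type,production_time,production_volume,production_date,quality_control\n"
    "1,extruding machine A,120,500 kg,2021-01-01,yes\n"
    "4,extruding machine D,90,350 kg,2021-01-04,no\n"
)
sample = pd.read_csv(sample_csv)

# Assumption: every volume is a number followed by the unit "kg".
sample["production_volume_kg"] = (
    sample["production_volume"]
    .str.replace("kg", "", regex=False)
    .str.strip()
    .astype(float)
)

# Parse the ISO-formatted date strings into datetime values.
sample["production_date"] = pd.to_datetime(sample["production_date"])

# Assumption: "yes"/"no" is a simple flag; map it to booleans.
sample["quality_control_flag"] = sample["quality_control"].map({"yes": True, "no": False})

print(sample.dtypes)
```

With numeric volumes and real dates, the same `groupby` pattern used for `production_time` below could also be applied to `production_volume_kg`.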

## Calculate the total production time for each machine

In [3]:
# Calculate the total production time for each machine
machine_production_time = df.groupby('machine_type')['production_time'].sum()

print(machine_production_time)

machine_type
cutting machine A       40
cutting machine B       30
cutting machine C       20
cutting machine D       10
extruding machine A    120
extruding machine B    100
extruding machine C    110
extruding machine D     90
printing machine A      80
printing machine B      70
printing machine C      60
printing machine D      50
punching machine A       5
punching machine B      10
punching machine C      15
punching machine D      15
recycling machine A      5
recycling machine B     10
recycling machine C     15
recycling machine D     25
Name: production_time, dtype: int64


## Find the machine with the highest production time

In [4]:
# Find the machine with the highest production time
most_productive_machine = machine_production_time.idxmax()

print(f"The most productive machine: {most_productive_machine}")

The most productive machine: extruding machine A


## Calculate the total production time for each extruding machine

In [5]:
# Calculate the total production time for each extruding machine
extruding_machine = df[df['machine_type'].str.contains('extruding machine')]

extruding_machine_production_time = extruding_machine.groupby('machine_type')['production_time'].sum()

print(extruding_machine_production_time)

machine_type
extruding machine A    120
extruding machine B    100
extruding machine C    110
extruding machine D     90
Name: production_time, dtype: int64


## Find the extruding machine with the highest production time

In [6]:
# Find the extruding machine with the highest production time
most_productive_extruder_machine = extruding_machine_production_time.idxmax()

print(f"The most productive extruding machine: {most_productive_extruder_machine}")

The most productive extruding machine: extruding machine A


## Calculate the average production time per extruding machine

In [7]:
# Average the per-machine totals, so the value stays comparable with
# extruding_machine_production_time even if a machine has several rows
extruder_average_production_time = extruding_machine_production_time.mean()

print(f"The average production time per extruding machine: {extruder_average_production_time}")

The average production time per extruding machine: 105.0


## Find the extruding machines that are underperforming (production time below average)

In [8]:
# Find the extruding machines that are underperforming (production time below average)
underperforming_extruder_machines = extruding_machine_production_time[extruding_machine_production_time < extruder_average_production_time].index

print("Underperforming machines: ", list(underperforming_extruder_machines))

Underperforming machines:  ['extruding machine B', 'extruding machine D']


## Move the workload from the most productive extruding machine to an underperforming machine

In [9]:
# Reassign the rows of the extruding machine with the highest production time
# to the first underperforming extruding machine. This relabels every matching
# row, so the whole workload moves; machine_id is left unchanged.
if len(underperforming_extruder_machines) > 0:
    df.loc[df['machine_type'] == most_productive_extruder_machine, 'machine_type'] = underperforming_extruder_machines[0]
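
The cell above relabels every row of the busiest machine, which moves its entire workload rather than part of it. As a hedged alternative, the sketch below shifts only the time above the average from the busiest machine to the least-loaded one. The `rebalance` helper is hypothetical, and the totals are the extruding-machine values printed earlier.

```python
import pandas as pd

# Per-machine totals copied from the extruding-machine output above.
totals = pd.Series(
    {
        "extruding machine A": 120,
        "extruding machine B": 100,
        "extruding machine C": 110,
        "extruding machine D": 90,
    },
    name="production_time",
)


def rebalance(totals: pd.Series) -> pd.Series:
    """Move the busiest machine's time above the mean to the least-loaded machine."""
    balanced = totals.astype(float).copy()
    busiest = balanced.idxmax()
    least_loaded = balanced.idxmin()
    surplus = balanced[busiest] - balanced.mean()
    balanced[busiest] -= surplus
    balanced[least_loaded] += surplus
    return balanced


balanced = rebalance(totals)
print(balanced)
```

This keeps the overall production time unchanged while narrowing the gap between machines. Whether a given amount of work can actually be moved between two machines is an assumption the sketch does not check.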

## Save the updated schedule to a new CSV file

In [10]:
# Save the updated schedule to a new CSV file
df.to_csv('updated_production_schedule.csv', index=False)

This code does the following:

1. Loads the production data from a CSV-formatted text file into a Pandas DataFrame.
2. Calculates the total production time for each machine by grouping the data by `machine_type` and summing the production time for each group.
3. Finds the machine with the highest total production time.
4. Filters the extruding machines and calculates their average production time.
5. Finds the extruding machines that are underperforming (production time below average).
6. Reassigns the workload of the machine with the highest production time to the first underperforming extruding machine.
7. Saves the updated schedule to a new CSV file.

## Insights

This analysis uses total production time as its measure of productivity. By that measure, the *most productive* extruding machine is `'extruding machine A'`, and the *underperforming* machines are `'extruding machine B'` and `'extruding machine D'`, which suggests that performance differs between these machines. It could be that `'extruding machine A'` is more efficient or better maintained, allowing it to run for longer. On the other hand, `'extruding machine B'` and `'extruding machine D'` may have logged less production time because of a lower production capacity or more frequent breakdowns. Production time alone does not measure output, however, so these readings should be checked against the volume each machine produced.

To further understand the differences in performance between these machines, it might be helpful to gather more data on factors such as production capacity and maintenance frequency, and to examine the existing `production_time`, `production_volume`, and `raw_material_usage` columns more closely. This could help identify specific areas where `'extruding machine A'` is outperforming the other machines, and suggest potential improvements that could be made to increase the performance of `'extruding machine B'` and `'extruding machine D'`.
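
One way to probe these differences with the data already available is to compare throughput, that is, volume produced per unit of production time. The sketch below uses only the four extruding-machine rows visible in the `df.head()` output. The column names `volume_kg` and `kg_per_time_unit` are hypothetical, and the unit of `production_time` is not stated in the data.

```python
import pandas as pd

# Extruding-machine rows copied from the df.head() output.
extruders = pd.DataFrame(
    {
        "machine_type": [
            "extruding machine A",
            "extruding machine B",
            "extruding machine C",
            "extruding machine D",
        ],
        "production_time": [120, 100, 110, 90],
        "production_volume": ["500 kg", "450 kg", "400 kg", "350 kg"],
    }
)

# Assumption: every volume is a number followed by "kg"; keep the numeric part.
extruders["volume_kg"] = (
    extruders["production_volume"].str.extract(r"([\d.]+)", expand=False).astype(float)
)

# Sum volume and time per machine, then divide to get throughput.
per_machine = extruders.groupby("machine_type")[["volume_kg", "production_time"]].sum()
per_machine["kg_per_time_unit"] = per_machine["volume_kg"] / per_machine["production_time"]

print(per_machine.sort_values("kg_per_time_unit", ascending=False))
```

On these four rows, `'extruding machine B'` has the highest throughput even though its total production time is below average, which illustrates why production time on its own may not be a reliable measure of productivity.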