Ashleshk/Data-Engineering-Real-Time-Performance-Optimization

Data-Engineering:

Real-Time Performance Optimization for Manufacturing Company

Real-Time Performance Optimization Project with complete streaming data pipeline including acquisition, storage, enrichment, analysis and visualization

  • Credit: Special thanks to AWS and Thoughtworks for conducting the "Data-Driven Everything (D2E)" workshop in Chicago. Thank you all for this amazing workshop.

  • Source: AWS Catalog Workshop

Project Objective

AnyCompany Manufacturing is looking to become a world-class manufacturing company. To be considered world class, every machine in the factory should operate at an Overall Operating Efficiency of 85% or higher. AnyCompany Manufacturing has found that it is close to this target but suspects that some machine operators need improvement.

As the analyst, you need to investigate this and work backwards to solve the problem of low operating efficiency. You will ultimately build a real-time data pipeline that produces machine insights and visualizations of factory effectiveness.

AWS Services used

  • Amazon Simple Storage Service
  • AWS Glue
  • Amazon Kinesis
  • AWS Lambda
  • Amazon Simple Notification Service (SNS)
  • Amazon QuickSight
  • AWS Identity and Access Management
  • Amazon Redshift Serverless
  • Amazon SageMaker

Proposed Solution

Architecture diagram

Attribute Information

The historical data is made up of the following fields:

line - the factory line id
line_description - description of the type of vehicle parts made on the line.
operator_name - last name, first name of the employee operating the machine at the time of data entry.
operator_perno - HR personnel number of the employee.
assembly_id - vehicle part number being assembled.
machine_model_no - model number of the manufacturing machine.
machine_manufacturer_id - unique id of the specific machine (concatenation of line + machine).
machine - the machine location on a given line.
machine_description - description of the area of the vehicle that the machine assembles parts for.
units_target - target number of units to be produced for a shift.
timestamp - time at which the event was recorded.
wire_tension - amount of tension for the machine's wire (in newtons), this wire helps with assembly.
power_unit - amount of power being generated by the machine (0-100) in kilowatts.
error_code - code of error being produced at the moment.
bit_speed - how fast the machine's rotating bit (which powers the machine) is spinning, in revolutions per minute (RPM).
temperature - the current temperature of the machine in Celsius.
units - current number of units produced (the machine has the ability to track when a full unit is produced, allowing a decimal to be shown between each full unit).
defective_count - number of defects produced at the given time (shown as a decimal but can be converted to a percentage).
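
As an illustration of how a record with these fields might be handled in code, here is a minimal Python sketch that turns a raw, string-valued record into a typed reading. The class name `MachineReading`, the helper `parse_reading`, the chosen Python types, the ISO 8601 timestamp format, and the sample values are all assumptions made for this example; they are not taken from the workshop data. Only a subset of the fields is shown.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class MachineReading:
    """One sensor reading; field names follow the attribute list above."""
    line: str
    machine: str
    operator_name: str
    timestamp: datetime
    power_unit: float       # kilowatts, documented range 0-100
    temperature: float      # degrees Celsius
    units: float            # may be fractional between full units
    defective_count: float  # recorded as a decimal


def parse_reading(raw: dict) -> MachineReading:
    """Convert a raw string-valued record into a typed reading."""
    power = float(raw["power_unit"])
    # Reject values outside the range stated in the field description.
    if not 0.0 <= power <= 100.0:
        raise ValueError(f"power_unit out of documented range: {power}")
    return MachineReading(
        line=raw["line"],
        machine=raw["machine"],
        operator_name=raw["operator_name"],
        # Assumption: timestamps arrive as ISO 8601 strings.
        timestamp=datetime.fromisoformat(raw["timestamp"]),
        power_unit=power,
        temperature=float(raw["temperature"]),
        units=float(raw["units"]),
        defective_count=float(raw["defective_count"]),
    )


# Hypothetical sample record; the values are made up for illustration.
sample = {
    "line": "L1",
    "machine": "M5",
    "operator_name": "Doe, John",
    "timestamp": "2023-01-15T08:30:00",
    "power_unit": "42.5",
    "temperature": "71.2",
    "units": "10.25",
    "defective_count": "0.5",
}
reading = parse_reading(sample)
```

Validating `power_unit` against its documented range at parse time is one possible design choice; a real pipeline might instead route out-of-range readings to a separate stream for inspection.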

Questions Addressed & My Findings (Insights for Business Decisions)

  1. Question - How can we visualize the operating parameters for each machine in the factory?

Answer: Select any data point in the line chart, say LO1, Machine 5.

  • The value of the line field for that data point is applied as a filter to the box plot visual, which shows that the machine operated during this time period with wider parameter settings.
  • We also see that the median settings for Machine 5 during this time period were drastically different from those of the other machines.
  2. Question - Which machine was the most defective, and which operator was responsible?

Answer: Line 1 Machine 5 was the most defective machine, and it stayed that way for frequent periods of time.

  • John Doe worked mostly on this machine and, overall, was responsible for many more defects (count: 4681.46) than his colleagues, as the Machine View dashboard below shows.
  • Based on this, we recommend that John receive more training to reduce the overall number of defects.
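
The findings above come from the dashboards. As a rough offline cross-check, the same kind of ranking could be sketched in plain Python by summing `defective_count` per line, machine, and operator. The function name `defects_by_operator` and the sample rows are hypothetical, and the sketch assumes each row's `defective_count` is an increment for that reading rather than a running total.

```python
from collections import defaultdict


def defects_by_operator(rows):
    """Sum defective_count per (line, machine, operator_name), highest first."""
    totals = defaultdict(float)
    for row in rows:
        key = (row["line"], row["machine"], row["operator_name"])
        # Assumption: defective_count is per reading, not cumulative.
        totals[key] += row["defective_count"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical rows; the values are made up for illustration.
rows = [
    {"line": "L1", "machine": "M5", "operator_name": "Doe, John", "defective_count": 2.5},
    {"line": "L1", "machine": "M5", "operator_name": "Doe, John", "defective_count": 1.5},
    {"line": "L1", "machine": "M2", "operator_name": "Roe, Jane", "defective_count": 0.5},
]
ranking = defects_by_operator(rows)
worst_key, worst_total = ranking[0]
print(worst_key, worst_total)
```

If `defective_count` turns out to be a running total per shift, the sum would need to be replaced by the last (or maximum) value per group.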

Data preparation

Data preparation


Visualization

  1. Factory View Dashboard

  2. Machine View Dashboard
