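# 04. Combining Data with Pandas - Practice Problems

## About This Lab
- 10 problems in total
- Multi-table joins with concat and merge
- A realistic scenario: integrated analysis of manufacturing data
- Combined use of production, quality, equipment, and maintenance data

## Loading the Data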

In [372]:
import pandas as pd
import numpy as np

# Load the data
production_df = pd.read_csv('../data/05_production.csv', encoding='utf-8-sig')
quality_df = pd.read_csv('../data/07_quality_inspection.csv', encoding='utf-8-sig', na_values=['\\N'])
equipment_df = pd.read_csv('../data/01_equipment.csv', encoding='utf-8-sig')
sensor_df = pd.read_csv('../data/08_sensor_data.csv', encoding='utf-8-sig')
maintenance_df = pd.read_csv('../data/10_maintenance_history.csv', encoding='utf-8-sig')
operation_df = pd.read_csv('../data/06_equipment_operation.csv', encoding='utf-8-sig')
product_df = pd.read_csv('../data/02_product.csv', encoding='utf-8-sig')

# Basic preprocessing
production_df['production_date'] = pd.to_datetime(production_df['production_date'])
quality_df['inspection_time'] = pd.to_datetime(quality_df['inspection_time'])

print("Data loaded!")
print(f"Production: {len(production_df):,} rows")
print(f"Quality: {len(quality_df):,} rows")
print(f"Equipment: {len(equipment_df):,} rows")
print(f"Sensor: {len(sensor_df):,} rows")
print(f"Maintenance: {len(maintenance_df):,} rows")
print(f"Equipment operation: {len(operation_df):,} rows")
print(f"Product: {len(product_df):,} rows")

Data loaded!
Production: 1,872 rows
Quality: 37,417 rows
Equipment: 5 rows
Sensor: 10,920 rows
Maintenance: 98 rows
Equipment operation: 3,304 rows
Product: 3 rows


---
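## Problem 1: Add Equipment Information to the Production Data

**Scenario**: Join detailed equipment information onto the production data to make analysis easier.

**Requirements**:
1. Join production_df and equipment_df on equipment_id (left join)
2. Keep only the columns you need:
   - From equipment_df: equipment_name, equipment_type, rated_capacity (this dataset has no manufacturer column)
3. Print the first 10 rows of the joined data
4. Compare the row counts before and after the join

**Hint**: `pd.merge()`, how='left', on='equipment_id'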
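
As a rough illustration of the left join described above, the sketch below uses two small made-up frames standing in for production_df and equipment_df (all values are invented). It also shows two optional `pd.merge()` arguments, `validate` and `indicator`, which are not required by the exercise but can help confirm that the row count is preserved.

```python
import pandas as pd

# Toy stand-ins for production_df and equipment_df (values are made up).
production = pd.DataFrame({
    "production_id": [1, 2, 3],
    "equipment_id": ["INJ-001", "INJ-001", "PRESS-001"],
    "actual_quantity": [81, 78, 106],
})
equipment = pd.DataFrame({
    "equipment_id": ["INJ-001", "PRESS-001"],
    "equipment_name": ["Injector 1", "Press 1"],
    "rated_capacity": [150.0, 200.0],
})

# Selecting the lookup columns before merging avoids *_x / *_y suffixes.
merged = pd.merge(
    production,
    equipment[["equipment_id", "equipment_name", "rated_capacity"]],
    on="equipment_id",
    how="left",
    validate="many_to_one",  # raises MergeError if equipment_id repeats on the right
    indicator=True,          # adds a _merge column ('both' or 'left_only')
)

print(len(production), len(merged))
print(merged["_merge"].value_counts())
```

With a many-to-one left join the merged frame should have exactly as many rows as the left frame, which is what requirement 4 asks you to check.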

In [373]:
# 1. Join production_df and equipment_df on equipment_id (left join)
prod_eq = pd.merge(production_df, equipment_df, on='equipment_id', how='left')
prod_eq

Unnamed: 0,production_id,equipment_id,product_code,production_date,start_time,end_time,target_quantity,actual_quantity,good_quantity,defect_quantity,...,created_at_x,updated_at_x,equipment_name,equipment_type,location,rated_capacity,installation_date,status,created_at_y,updated_at_y
0,1,INJ-001,BUMPER-A,2024-01-01,2024-01-01 08:14:00,2024-01-01 09:53:32,97,81,77,4,...,2026-01-30 00:42:48,2026-01-30 00:42:48,사출기 1호기,사출기,A동 1라인,150.0,2020-03-15,ACTIVE,2024-01-01 00:00:00,2024-01-01 00:00:00
1,2,INJ-001,BUMPER-A,2024-01-01,2024-01-01 21:02:00,2024-01-01 22:33:43,83,78,72,6,...,2026-01-30 00:42:48,2026-01-30 00:42:48,사출기 1호기,사출기,A동 1라인,150.0,2020-03-15,ACTIVE,2024-01-01 00:00:00,2024-01-01 00:00:00
2,3,INJ-002,BUMPER-A,2024-01-01,2024-01-01 10:12:00,2024-01-01 13:16:28,149,135,132,3,...,2026-01-30 00:42:48,2026-01-30 00:42:48,사출기 2호기,사출기,A동 1라인,150.0,2021-06-20,ACTIVE,2024-01-01 00:00:00,2024-01-01 00:00:00
3,4,INJ-002,DASH-C,2024-01-01,2024-01-01 12:48:00,2024-01-01 15:16:31,100,92,90,2,...,2026-01-30 00:42:48,2026-01-30 00:42:48,사출기 2호기,사출기,A동 1라인,150.0,2021-06-20,ACTIVE,2024-01-01 00:00:00,2024-01-01 00:00:00
4,5,INJ-002,DOOR-B,2024-01-01,2024-01-01 20:48:00,2024-01-01 23:12:13,123,129,122,7,...,2026-01-30 00:42:48,2026-01-30 00:42:48,사출기 2호기,사출기,A동 1라인,150.0,2021-06-20,ACTIVE,2024-01-01 00:00:00,2024-01-01 00:00:00
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1867,1868,PRESS-002,BUMPER-A,2024-03-31,2024-03-31 20:19:00,2024-03-31 23:25:19,150,144,119,25,...,2026-01-30 00:42:48,2026-01-30 00:42:48,프레스 2호기,프레스,A동 2라인,200.0,2022-08-25,ACTIVE,2024-01-01 00:00:00,2024-01-01 00:00:00
1868,1869,PRESS-002,DASH-C,2024-03-31,2024-04-01 00:15:00,2024-04-01 02:59:58,136,130,109,21,...,2026-01-30 00:42:48,2026-01-30 00:42:48,프레스 2호기,프레스,A동 2라인,200.0,2022-08-25,ACTIVE,2024-01-01 00:00:00,2024-01-01 00:00:00
1869,1870,PRESS-002,BUMPER-A,2024-03-31,2024-04-01 05:53:00,2024-04-01 07:26:15,84,80,66,14,...,2026-01-30 00:42:48,2026-01-30 00:42:48,프레스 2호기,프레스,A동 2라인,200.0,2022-08-25,ACTIVE,2024-01-01 00:00:00,2024-01-01 00:00:00
1870,1871,ASM-001,BUMPER-A,2024-03-31,2024-03-31 10:24:00,2024-03-31 13:25:41,143,121,101,20,...,2026-01-30 00:42:48,2026-01-30 00:42:48,조립라인 1호기,조립라인,B동 1라인,100.0,2020-11-30,ACTIVE,2024-01-01 00:00:00,2024-01-01 00:00:00


In [374]:
prod_eq.columns

Index(['production_id', 'equipment_id', 'product_code', 'production_date',
       'start_time', 'end_time', 'target_quantity', 'actual_quantity',
       'good_quantity', 'defect_quantity', 'cycle_time', 'work_order_no',
       'lot_no', 'operator_id', 'shift', 'created_at_x', 'updated_at_x',
       'equipment_name', 'equipment_type', 'location', 'rated_capacity',
       'installation_date', 'status', 'created_at_y', 'updated_at_y'],
      dtype='object')

In [375]:
# 2. Keep only the columns you need:
#    - From equipment_df: equipment_name, equipment_type, rated_capacity (no manufacturer column exists)
# Alternative: select the columns before merging
# prod_eq = pd.merge(production_df, equipment_df[['equipment_id', 'equipment_name', 'equipment_type', 'rated_capacity']], on='equipment_id', how='left')
prod_eq = prod_eq.drop(columns=['location', 'installation_date', 'status', 'created_at_y', 'updated_at_y'])
prod_eq

Unnamed: 0,production_id,equipment_id,product_code,production_date,start_time,end_time,target_quantity,actual_quantity,good_quantity,defect_quantity,cycle_time,work_order_no,lot_no,operator_id,shift,created_at_x,updated_at_x,equipment_name,equipment_type,rated_capacity
0,1,INJ-001,BUMPER-A,2024-01-01,2024-01-01 08:14:00,2024-01-01 09:53:32,97,81,77,4,73.73,WO202401019935,LOT2024010100101,OP003,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48,사출기 1호기,사출기,150.0
1,2,INJ-001,BUMPER-A,2024-01-01,2024-01-01 21:02:00,2024-01-01 22:33:43,83,78,72,6,70.56,WO202401012535,LOT2024010100110,OP006,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,사출기 1호기,사출기,150.0
2,3,INJ-002,BUMPER-A,2024-01-01,2024-01-01 10:12:00,2024-01-01 13:16:28,149,135,132,3,81.99,WO202401018359,LOT2024010100201,OP001,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48,사출기 2호기,사출기,150.0
3,4,INJ-002,DASH-C,2024-01-01,2024-01-01 12:48:00,2024-01-01 15:16:31,100,92,90,2,96.87,WO202401016574,LOT2024010100202,OP003,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48,사출기 2호기,사출기,150.0
4,5,INJ-002,DOOR-B,2024-01-01,2024-01-01 20:48:00,2024-01-01 23:12:13,123,129,122,7,67.08,WO202401012674,LOT2024010100210,OP004,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,사출기 2호기,사출기,150.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1867,1868,PRESS-002,BUMPER-A,2024-03-31,2024-03-31 20:19:00,2024-03-31 23:25:19,150,144,119,25,77.63,WO202403317101,LOT2024033100210,OP006,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,프레스 2호기,프레스,200.0
1868,1869,PRESS-002,DASH-C,2024-03-31,2024-04-01 00:15:00,2024-04-01 02:59:58,136,130,109,21,76.15,WO202403318434,LOT2024033100211,OP004,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,프레스 2호기,프레스,200.0
1869,1870,PRESS-002,BUMPER-A,2024-03-31,2024-04-01 05:53:00,2024-04-01 07:26:15,84,80,66,14,69.95,WO202403317294,LOT2024033100212,OP004,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,프레스 2호기,프레스,200.0
1870,1871,ASM-001,BUMPER-A,2024-03-31,2024-03-31 10:24:00,2024-03-31 13:25:41,143,121,101,20,90.10,WO202403317268,LOT2024033100101,OP003,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48,조립라인 1호기,조립라인,100.0


In [376]:
# 3. Print the first 10 rows of the joined data
prod_eq.head(10)

Unnamed: 0,production_id,equipment_id,product_code,production_date,start_time,end_time,target_quantity,actual_quantity,good_quantity,defect_quantity,cycle_time,work_order_no,lot_no,operator_id,shift,created_at_x,updated_at_x,equipment_name,equipment_type,rated_capacity
0,1,INJ-001,BUMPER-A,2024-01-01,2024-01-01 08:14:00,2024-01-01 09:53:32,97,81,77,4,73.73,WO202401019935,LOT2024010100101,OP003,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48,사출기 1호기,사출기,150.0
1,2,INJ-001,BUMPER-A,2024-01-01,2024-01-01 21:02:00,2024-01-01 22:33:43,83,78,72,6,70.56,WO202401012535,LOT2024010100110,OP006,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,사출기 1호기,사출기,150.0
2,3,INJ-002,BUMPER-A,2024-01-01,2024-01-01 10:12:00,2024-01-01 13:16:28,149,135,132,3,81.99,WO202401018359,LOT2024010100201,OP001,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48,사출기 2호기,사출기,150.0
3,4,INJ-002,DASH-C,2024-01-01,2024-01-01 12:48:00,2024-01-01 15:16:31,100,92,90,2,96.87,WO202401016574,LOT2024010100202,OP003,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48,사출기 2호기,사출기,150.0
4,5,INJ-002,DOOR-B,2024-01-01,2024-01-01 20:48:00,2024-01-01 23:12:13,123,129,122,7,67.08,WO202401012674,LOT2024010100210,OP004,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,사출기 2호기,사출기,150.0
5,6,INJ-002,BUMPER-A,2024-01-01,2024-01-02 00:22:00,2024-01-02 03:10:05,124,130,123,7,77.58,WO202401015333,LOT2024010100211,OP005,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,사출기 2호기,사출기,150.0
6,7,PRESS-001,DASH-C,2024-01-01,2024-01-01 10:07:00,2024-01-01 12:48:06,128,106,102,4,91.2,WO202401015803,LOT2024010100101,OP010,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48,프레스 1호기,프레스,200.0
7,8,PRESS-001,DASH-C,2024-01-01,2024-01-01 14:12:00,2024-01-01 15:58:01,88,73,70,3,87.14,WO202401014733,LOT2024010100102,OP003,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48,프레스 1호기,프레스,200.0
8,9,PRESS-001,DOOR-B,2024-01-01,2024-01-01 20:55:00,2024-01-01 22:27:01,92,90,84,6,61.35,WO202401017227,LOT2024010100110,OP004,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,프레스 1호기,프레스,200.0
9,10,PRESS-001,DOOR-B,2024-01-01,2024-01-02 02:53:00,2024-01-02 05:09:08,126,123,115,8,66.41,WO202401013664,LOT2024010100111,OP005,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,프레스 1호기,프레스,200.0


In [377]:
# 4. Compare the row counts before and after the join
print(len(production_df), len(equipment_df), len(prod_eq))

1872 5 1872


---
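## Problem 2: Add Product Information to the Production Data

**Scenario**: Join detailed product information onto the production data.

**Requirements**:
1. Join production_df and product_df on product_code (left join)
2. Add the product name (product_name) and category (category; product_df in this dataset has no category column, so product_name stands in for it below)
3. Count production runs per category
4. Compute the mean defect rate per category (defect_quantity / actual_quantity)

**Hint**: merge, then aggregate with groupby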
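
The merge-then-groupby pattern from the hint can be sketched on toy data as follows. The frames and their values are invented for illustration. The sketch also contrasts two possible readings of "mean defect rate": the mean of per-run rates, and a pooled rate (total defects divided by total output). Which one the exercise intends is an assumption on the reader's part.

```python
import pandas as pd

# Toy stand-ins for production_df and product_df (values are made up).
production = pd.DataFrame({
    "production_id": [1, 2, 3],
    "product_code": ["A", "A", "B"],
    "actual_quantity": [100, 50, 200],
    "defect_quantity": [10, 10, 20],
})
product = pd.DataFrame({
    "product_code": ["A", "B"],
    "product_name": ["Bumper", "Door"],
})

merged = production.merge(product, on="product_code", how="left")
merged["defect_rate"] = merged["defect_quantity"] / merged["actual_quantity"]

summary = merged.groupby("product_name").agg(
    runs=("production_id", "count"),
    mean_defect_rate=("defect_rate", "mean"),   # every run weighted equally
    total_defects=("defect_quantity", "sum"),
    total_output=("actual_quantity", "sum"),
)
# Pooled rate: large runs count for more than small ones.
summary["pooled_defect_rate"] = summary["total_defects"] / summary["total_output"]
print(summary)
```

The two rates differ whenever run sizes differ, so it is worth stating which one a report uses.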

In [378]:
# 1. Join production_df and product_df on product_code (left join)
prod_prod = pd.merge(production_df, product_df, on='product_code', how='left')
prod_prod

Unnamed: 0,production_id,equipment_id,product_code,production_date,start_time,end_time,target_quantity,actual_quantity,good_quantity,defect_quantity,...,updated_at_x,product_name,specification,unit,standard_cycle_time,target_quality_rate,upper_spec_limit,lower_spec_limit,created_at_y,updated_at_y
0,1,INJ-001,BUMPER-A,2024-01-01,2024-01-01 08:14:00,2024-01-01 09:53:32,97,81,77,4,...,2026-01-30 00:42:48,범퍼,1200x300x150mm,EA,75.0,98.5,305.0,295.0,2024-01-01 00:00:00,2024-01-01 00:00:00
1,2,INJ-001,BUMPER-A,2024-01-01,2024-01-01 21:02:00,2024-01-01 22:33:43,83,78,72,6,...,2026-01-30 00:42:48,범퍼,1200x300x150mm,EA,75.0,98.5,305.0,295.0,2024-01-01 00:00:00,2024-01-01 00:00:00
2,3,INJ-002,BUMPER-A,2024-01-01,2024-01-01 10:12:00,2024-01-01 13:16:28,149,135,132,3,...,2026-01-30 00:42:48,범퍼,1200x300x150mm,EA,75.0,98.5,305.0,295.0,2024-01-01 00:00:00,2024-01-01 00:00:00
3,4,INJ-002,DASH-C,2024-01-01,2024-01-01 12:48:00,2024-01-01 15:16:31,100,92,90,2,...,2026-01-30 00:42:48,대시보드,1000x400x80mm,EA,85.0,98.0,405.0,395.0,2024-01-01 00:00:00,2024-01-01 00:00:00
4,5,INJ-002,DOOR-B,2024-01-01,2024-01-01 20:48:00,2024-01-01 23:12:13,123,129,122,7,...,2026-01-30 00:42:48,도어패널,800x600x50mm,EA,65.0,99.0,605.0,595.0,2024-01-01 00:00:00,2024-01-01 00:00:00
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1867,1868,PRESS-002,BUMPER-A,2024-03-31,2024-03-31 20:19:00,2024-03-31 23:25:19,150,144,119,25,...,2026-01-30 00:42:48,범퍼,1200x300x150mm,EA,75.0,98.5,305.0,295.0,2024-01-01 00:00:00,2024-01-01 00:00:00
1868,1869,PRESS-002,DASH-C,2024-03-31,2024-04-01 00:15:00,2024-04-01 02:59:58,136,130,109,21,...,2026-01-30 00:42:48,대시보드,1000x400x80mm,EA,85.0,98.0,405.0,395.0,2024-01-01 00:00:00,2024-01-01 00:00:00
1869,1870,PRESS-002,BUMPER-A,2024-03-31,2024-04-01 05:53:00,2024-04-01 07:26:15,84,80,66,14,...,2026-01-30 00:42:48,범퍼,1200x300x150mm,EA,75.0,98.5,305.0,295.0,2024-01-01 00:00:00,2024-01-01 00:00:00
1870,1871,ASM-001,BUMPER-A,2024-03-31,2024-03-31 10:24:00,2024-03-31 13:25:41,143,121,101,20,...,2026-01-30 00:42:48,범퍼,1200x300x150mm,EA,75.0,98.5,305.0,295.0,2024-01-01 00:00:00,2024-01-01 00:00:00


In [379]:
# 2. Add the product name (product_df has no category column)
prod_prod = pd.merge(production_df, product_df[['product_code', 'product_name']], on='product_code', how='left')
prod_prod

Unnamed: 0,production_id,equipment_id,product_code,production_date,start_time,end_time,target_quantity,actual_quantity,good_quantity,defect_quantity,cycle_time,work_order_no,lot_no,operator_id,shift,created_at,updated_at,product_name
0,1,INJ-001,BUMPER-A,2024-01-01,2024-01-01 08:14:00,2024-01-01 09:53:32,97,81,77,4,73.73,WO202401019935,LOT2024010100101,OP003,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48,범퍼
1,2,INJ-001,BUMPER-A,2024-01-01,2024-01-01 21:02:00,2024-01-01 22:33:43,83,78,72,6,70.56,WO202401012535,LOT2024010100110,OP006,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,범퍼
2,3,INJ-002,BUMPER-A,2024-01-01,2024-01-01 10:12:00,2024-01-01 13:16:28,149,135,132,3,81.99,WO202401018359,LOT2024010100201,OP001,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48,범퍼
3,4,INJ-002,DASH-C,2024-01-01,2024-01-01 12:48:00,2024-01-01 15:16:31,100,92,90,2,96.87,WO202401016574,LOT2024010100202,OP003,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48,대시보드
4,5,INJ-002,DOOR-B,2024-01-01,2024-01-01 20:48:00,2024-01-01 23:12:13,123,129,122,7,67.08,WO202401012674,LOT2024010100210,OP004,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,도어패널
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1867,1868,PRESS-002,BUMPER-A,2024-03-31,2024-03-31 20:19:00,2024-03-31 23:25:19,150,144,119,25,77.63,WO202403317101,LOT2024033100210,OP006,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,범퍼
1868,1869,PRESS-002,DASH-C,2024-03-31,2024-04-01 00:15:00,2024-04-01 02:59:58,136,130,109,21,76.15,WO202403318434,LOT2024033100211,OP004,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,대시보드
1869,1870,PRESS-002,BUMPER-A,2024-03-31,2024-04-01 05:53:00,2024-04-01 07:26:15,84,80,66,14,69.95,WO202403317294,LOT2024033100212,OP004,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,범퍼
1870,1871,ASM-001,BUMPER-A,2024-03-31,2024-03-31 10:24:00,2024-03-31 13:25:41,143,121,101,20,90.10,WO202403317268,LOT2024033100101,OP003,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48,범퍼


In [380]:
# 3. Count production runs per category (product_name stands in for category)
prod_prod.value_counts('product_name')

product_name
범퍼      648
도어패널    641
대시보드    583
Name: count, dtype: int64

In [381]:
# 4. Mean defect rate per category, in percent (defect_quantity / actual_quantity * 100)
prod_prod['defect_rate'] = (prod_prod['defect_quantity']/prod_prod['actual_quantity']*100).round(2)
prod_prod.groupby('product_name')['defect_rate'].mean().round(2)


product_name
대시보드    10.12
도어패널    10.16
범퍼      10.38
Name: defect_rate, dtype: float64

---
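## Problem 3: Join Production Data with Quality Inspections

**Scenario**: Attach quality inspection records to each production run.

**Requirements**:
1. Join production_df and quality_df on production_id (left join)
2. Check the row count after the join (watch out for the 1:N relationship)
3. Count inspections per production_id
4. Find production runs with no inspection

**Hint**: left join, then check the result with value_counts() and isna()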
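
To make the 1:N caveat concrete, here is a small sketch with invented data in which one production run has two inspections and another has none. The `inspection_id` column is a hypothetical name used only for this illustration.

```python
import pandas as pd

# Toy data: run 1 has two inspections, run 2 has one, run 3 has none.
production = pd.DataFrame({
    "production_id": [1, 2, 3],
    "actual_quantity": [81, 78, 135],
})
quality = pd.DataFrame({
    "inspection_id": [101, 102, 103],   # hypothetical column name
    "production_id": [1, 1, 2],
    "result": ["PASS", "FAIL", "PASS"],
})

# A left join repeats each production row once per matching inspection
# and keeps unmatched runs with NaN in the quality columns.
merged = pd.merge(production, quality, on="production_id", how="left")
print(len(production), len(quality), len(merged))

# count() skips NaN, so an uninspected run is counted as 0 rather than 1.
inspections_per_run = merged.groupby("production_id")["inspection_id"].count()

# Runs whose quality columns are all missing had no inspection.
uninspected = merged.loc[merged["inspection_id"].isna(), "production_id"].tolist()
print(inspections_per_run)
print(uninspected)
```

Note that `value_counts()` on the merged `production_id` would report 1 for an uninspected run, because the left join still emits one row for it.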

In [382]:
# 1. Join production_df and quality_df on production_id (left join)
prod_qual = pd.merge(production_df, quality_df, on='production_id', how='left')
prod_qual

Unnamed: 0,production_id,equipment_id_x,product_code_x,production_date,start_time,end_time,target_quantity,actual_quantity,good_quantity,defect_quantity,...,inspection_type,result,defect_code,measurement_value,measurement_unit,inspector_id,lot_no_y,sample_size,notes,created_at_y
0,1,INJ-001,BUMPER-A,2024-01-01,2024-01-01 08:14:00,2024-01-01 09:53:32,97,81,77,4,...,FINAL,PASS,,300.8279,mm,OP007,LOT2024010100101,1,,2026-01-30 01:24:59
1,1,INJ-001,BUMPER-A,2024-01-01,2024-01-01 08:14:00,2024-01-01 09:53:32,97,81,77,4,...,FINAL,PASS,,299.7696,mm,OP008,LOT2024010100101,1,,2026-01-30 01:24:59
2,1,INJ-001,BUMPER-A,2024-01-01,2024-01-01 08:14:00,2024-01-01 09:53:32,97,81,77,4,...,FINAL,PASS,,301.0795,mm,OP007,LOT2024010100101,1,,2026-01-30 01:24:59
3,1,INJ-001,BUMPER-A,2024-01-01,2024-01-01 08:14:00,2024-01-01 09:53:32,97,81,77,4,...,FINAL,PASS,,302.5384,mm,OP007,LOT2024010100101,1,,2026-01-30 01:24:59
4,1,INJ-001,BUMPER-A,2024-01-01,2024-01-01 08:14:00,2024-01-01 09:53:32,97,81,77,4,...,FINAL,PASS,,299.6097,mm,OP007,LOT2024010100101,1,,2026-01-30 01:24:59
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
37412,1872,ASM-001,DOOR-B,2024-03-31,2024-03-31 20:14:00,2024-03-31 22:16:42,106,90,74,16,...,FINAL,FAIL,D002,608.2289,mm,OP008,LOT2024033100110,1,,2026-01-30 01:25:02
37413,1872,ASM-001,DOOR-B,2024-03-31,2024-03-31 20:14:00,2024-03-31 22:16:42,106,90,74,16,...,FINAL,FAIL,D001,609.7039,mm,OP007,LOT2024033100110,1,,2026-01-30 01:25:02
37414,1872,ASM-001,DOOR-B,2024-03-31,2024-03-31 20:14:00,2024-03-31 22:16:42,106,90,74,16,...,FINAL,FAIL,D010,590.4729,mm,OP008,LOT2024033100110,1,,2026-01-30 01:25:02
37415,1872,ASM-001,DOOR-B,2024-03-31,2024-03-31 20:14:00,2024-03-31 22:16:42,106,90,74,16,...,FINAL,FAIL,D001,609.0846,mm,OP008,LOT2024033100110,1,,2026-01-30 01:25:02


In [383]:
# 2. Check the row count after the join (watch out for the 1:N relationship)
print(len(production_df), len(quality_df), len(prod_qual))

1872 37417 37417


In [384]:
# 3. Count inspections per production_id
quality_df.groupby('production_id').agg(inspection_cnt=('production_id', 'count'))

production_id,inspection_cnt
1,11
2,13
3,13
4,11
5,17
...,...
1868,35
1869,31
1870,20
1871,30


In [385]:
# 4. Find production runs with no inspection
(~production_df['production_id'].isin(quality_df['production_id'])).sum()  # 0: every production run has at least one inspection

np.int64(0)

---
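## Problem 4: Aggregate Quality Inspections, Then Join to the Production Data

**Scenario**: Aggregate the quality inspection results per production_id, then add them to the production data.

**Requirements**:
1. Group quality_df by production_id and aggregate:
   - Number of inspections
   - Number of failures (result='FAIL')
   - Mean measurement value
2. Join the aggregate to production_df (left join)
3. Fill with 0 where there is no inspection
4. Print the first 10 rows

**Hint**: groupby + agg, reset_index(), merge, fillna(0)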
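
One way to follow the hint is named aggregation, which sets the output column names directly inside `agg()`. The sketch below uses invented data and English column names chosen for illustration. It fills only the count columns with 0 and leaves the mean as NaN for uninspected runs. That is a judgement call, not the exercise's stated requirement, since a mean measurement of 0 could be mistaken for a real reading.

```python
import pandas as pd

# Toy data: run 3 has no inspection at all.
production = pd.DataFrame({"production_id": [1, 2, 3]})
quality = pd.DataFrame({
    "production_id": [1, 1, 2],
    "result": ["PASS", "FAIL", "FAIL"],
    "measurement_value": [300.0, 302.0, 299.0],
})

summary = (
    quality.groupby("production_id")
    .agg(
        inspection_count=("result", "size"),
        fail_count=("result", lambda s: (s == "FAIL").sum()),
        mean_measurement=("measurement_value", "mean"),
    )
    .reset_index()  # turn production_id back into an ordinary column
)

# production_df goes on the left so uninspected runs are kept.
merged = production.merge(summary, on="production_id", how="left")

# Fill the counts only; the missing mean stays NaN in this sketch.
count_cols = ["inspection_count", "fail_count"]
merged[count_cols] = merged[count_cols].fillna(0).astype(int)
print(merged)
```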

In [386]:
# 1. Group quality_df by production_id and aggregate:
#    - Number of inspections
#    - Number of failures (result='FAIL')
#    - Mean measurement value
def defect_cnt(x):
    return (x == 'FAIL').sum()

qual_summary = quality_df.groupby('production_id').agg({'production_id':'count',
                                                        'result':defect_cnt,
                                                        'measurement_value':'mean'}).round(2)
qual_summary.columns = ['검사 건수', '불량 건수', '평균 측정값']
qual_summary

production_id,검사 건수,불량 건수,평균 측정값
1,11,4,300.58
2,13,6,299.20
3,13,3,298.30
4,11,2,400.70
5,17,7,598.72
...,...,...,...
1868,35,25,298.00
1869,31,21,400.62
1870,20,14,301.14
1871,30,20,300.78


In [387]:
# 2. Join the aggregate to production_df (left join)
pd.merge(production_df, qual_summary, on='production_id', how='left')

Unnamed: 0,production_id,equipment_id,product_code,production_date,start_time,end_time,target_quantity,actual_quantity,good_quantity,defect_quantity,cycle_time,work_order_no,lot_no,operator_id,shift,created_at,updated_at,검사 건수,불량 건수,평균 측정값
0,1,INJ-001,BUMPER-A,2024-01-01,2024-01-01 08:14:00,2024-01-01 09:53:32,97,81,77,4,73.73,WO202401019935,LOT2024010100101,OP003,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48,11,4,300.58
1,2,INJ-001,BUMPER-A,2024-01-01,2024-01-01 21:02:00,2024-01-01 22:33:43,83,78,72,6,70.56,WO202401012535,LOT2024010100110,OP006,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,13,6,299.20
2,3,INJ-002,BUMPER-A,2024-01-01,2024-01-01 10:12:00,2024-01-01 13:16:28,149,135,132,3,81.99,WO202401018359,LOT2024010100201,OP001,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48,13,3,298.30
3,4,INJ-002,DASH-C,2024-01-01,2024-01-01 12:48:00,2024-01-01 15:16:31,100,92,90,2,96.87,WO202401016574,LOT2024010100202,OP003,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48,11,2,400.70
4,5,INJ-002,DOOR-B,2024-01-01,2024-01-01 20:48:00,2024-01-01 23:12:13,123,129,122,7,67.08,WO202401012674,LOT2024010100210,OP004,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,17,7,598.72
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1867,1868,PRESS-002,BUMPER-A,2024-03-31,2024-03-31 20:19:00,2024-03-31 23:25:19,150,144,119,25,77.63,WO202403317101,LOT2024033100210,OP006,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,35,25,298.00
1868,1869,PRESS-002,DASH-C,2024-03-31,2024-04-01 00:15:00,2024-04-01 02:59:58,136,130,109,21,76.15,WO202403318434,LOT2024033100211,OP004,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,31,21,400.62
1869,1870,PRESS-002,BUMPER-A,2024-03-31,2024-04-01 05:53:00,2024-04-01 07:26:15,84,80,66,14,69.95,WO202403317294,LOT2024033100212,OP004,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,20,14,301.14
1870,1871,ASM-001,BUMPER-A,2024-03-31,2024-03-31 10:24:00,2024-03-31 13:25:41,143,121,101,20,90.10,WO202403317268,LOT2024033100101,OP003,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48,30,20,300.78


In [388]:
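# 3. Fill with 0 where there is no inspection
# how='right' keeps every production_df row; only the aggregate columns are filled
pd.merge(qual_summary, production_df, on='production_id', how='right').fillna({'검사 건수': 0, '불량 건수': 0, '평균 측정값': 0})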

Unnamed: 0,production_id,검사 건수,불량 건수,평균 측정값,equipment_id,product_code,production_date,start_time,end_time,target_quantity,actual_quantity,good_quantity,defect_quantity,cycle_time,work_order_no,lot_no,operator_id,shift,created_at,updated_at
0,1,11,4,300.58,INJ-001,BUMPER-A,2024-01-01,2024-01-01 08:14:00,2024-01-01 09:53:32,97,81,77,4,73.73,WO202401019935,LOT2024010100101,OP003,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48
1,2,13,6,299.20,INJ-001,BUMPER-A,2024-01-01,2024-01-01 21:02:00,2024-01-01 22:33:43,83,78,72,6,70.56,WO202401012535,LOT2024010100110,OP006,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48
2,3,13,3,298.30,INJ-002,BUMPER-A,2024-01-01,2024-01-01 10:12:00,2024-01-01 13:16:28,149,135,132,3,81.99,WO202401018359,LOT2024010100201,OP001,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48
3,4,11,2,400.70,INJ-002,DASH-C,2024-01-01,2024-01-01 12:48:00,2024-01-01 15:16:31,100,92,90,2,96.87,WO202401016574,LOT2024010100202,OP003,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48
4,5,17,7,598.72,INJ-002,DOOR-B,2024-01-01,2024-01-01 20:48:00,2024-01-01 23:12:13,123,129,122,7,67.08,WO202401012674,LOT2024010100210,OP004,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1867,1868,35,25,298.00,PRESS-002,BUMPER-A,2024-03-31,2024-03-31 20:19:00,2024-03-31 23:25:19,150,144,119,25,77.63,WO202403317101,LOT2024033100210,OP006,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48
1868,1869,31,21,400.62,PRESS-002,DASH-C,2024-03-31,2024-04-01 00:15:00,2024-04-01 02:59:58,136,130,109,21,76.15,WO202403318434,LOT2024033100211,OP004,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48
1869,1870,20,14,301.14,PRESS-002,BUMPER-A,2024-03-31,2024-04-01 05:53:00,2024-04-01 07:26:15,84,80,66,14,69.95,WO202403317294,LOT2024033100212,OP004,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48
1870,1871,30,20,300.78,ASM-001,BUMPER-A,2024-03-31,2024-03-31 10:24:00,2024-03-31 13:25:41,143,121,101,20,90.10,WO202403317268,LOT2024033100101,OP003,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48


In [389]:
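# 4. Print the first 10 rows
# how='right' keeps every production_df row; only the aggregate columns are filled
pd.merge(qual_summary, production_df, on='production_id', how='right').fillna({'검사 건수': 0, '불량 건수': 0, '평균 측정값': 0}).head(10)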

Unnamed: 0,production_id,검사 건수,불량 건수,평균 측정값,equipment_id,product_code,production_date,start_time,end_time,target_quantity,actual_quantity,good_quantity,defect_quantity,cycle_time,work_order_no,lot_no,operator_id,shift,created_at,updated_at
0,1,11,4,300.58,INJ-001,BUMPER-A,2024-01-01,2024-01-01 08:14:00,2024-01-01 09:53:32,97,81,77,4,73.73,WO202401019935,LOT2024010100101,OP003,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48
1,2,13,6,299.2,INJ-001,BUMPER-A,2024-01-01,2024-01-01 21:02:00,2024-01-01 22:33:43,83,78,72,6,70.56,WO202401012535,LOT2024010100110,OP006,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48
2,3,13,3,298.3,INJ-002,BUMPER-A,2024-01-01,2024-01-01 10:12:00,2024-01-01 13:16:28,149,135,132,3,81.99,WO202401018359,LOT2024010100201,OP001,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48
3,4,11,2,400.7,INJ-002,DASH-C,2024-01-01,2024-01-01 12:48:00,2024-01-01 15:16:31,100,92,90,2,96.87,WO202401016574,LOT2024010100202,OP003,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48
4,5,17,7,598.72,INJ-002,DOOR-B,2024-01-01,2024-01-01 20:48:00,2024-01-01 23:12:13,123,129,122,7,67.08,WO202401012674,LOT2024010100210,OP004,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48
5,6,17,7,301.32,INJ-002,BUMPER-A,2024-01-01,2024-01-02 00:22:00,2024-01-02 03:10:05,124,130,123,7,77.58,WO202401015333,LOT2024010100211,OP005,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48
6,7,14,4,402.0,PRESS-001,DASH-C,2024-01-01,2024-01-01 10:07:00,2024-01-01 12:48:06,128,106,102,4,91.2,WO202401015803,LOT2024010100101,OP010,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48
7,8,10,3,402.01,PRESS-001,DASH-C,2024-01-01,2024-01-01 14:12:00,2024-01-01 15:58:01,88,73,70,3,87.14,WO202401014733,LOT2024010100102,OP003,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48
8,9,14,6,601.63,PRESS-001,DOOR-B,2024-01-01,2024-01-01 20:55:00,2024-01-01 22:27:01,92,90,84,6,61.35,WO202401017227,LOT2024010100110,OP004,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48
9,10,18,8,601.27,PRESS-001,DOOR-B,2024-01-01,2024-01-02 02:53:00,2024-01-02 05:09:08,126,123,115,8,66.41,WO202401013664,LOT2024010100111,OP005,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48


---
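## Problem 5: Combined Equipment + Production + Maintenance Analysis

**Scenario**: Combine production results and maintenance history per machine to evaluate equipment performance.

**Requirements**:
1. Aggregate production_df by equipment:
   - Number of production runs, total output
2. Aggregate maintenance_df by equipment:
   - Number of maintenance events, total maintenance cost, total downtime
3. Join both aggregates onto equipment_df in sequence (left join)
4. Fill missing values with 0
5. Compute the average maintenance cost for each machine

**Hint**: aggregate each table, reset_index(), then merge sequentially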
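
The sequential merge can be written as two explicit `pd.merge()` calls, or folded over a list of frames as in the sketch below. All frames, column names, and values here are invented for illustration. The sketch also guards the average-cost division so that a machine with no maintenance events yields NaN instead of a division by zero.

```python
from functools import reduce

import pandas as pd

# Toy data: E2 has no maintenance history, E3 has no production.
equipment = pd.DataFrame({"equipment_id": ["E1", "E2", "E3"]})
prod_summary = pd.DataFrame({
    "equipment_id": ["E1", "E2"],
    "run_count": [2, 1],
    "total_quantity": [180, 90],
})
maint_summary = pd.DataFrame({
    "equipment_id": ["E1", "E3"],
    "maint_count": [2, 1],
    "total_cost": [300.0, 50.0],
})

# Left-join each summary onto the running result, starting from equipment.
combined = reduce(
    lambda left, right: left.merge(right, on="equipment_id", how="left"),
    [equipment, prod_summary, maint_summary],
)

# Fill only the aggregate columns, not descriptive columns.
value_cols = ["run_count", "total_quantity", "maint_count", "total_cost"]
combined[value_cols] = combined[value_cols].fillna(0)

# Mask zero counts so the division gives NaN instead of dividing by zero.
safe_count = combined["maint_count"].where(combined["maint_count"] > 0)
combined["avg_cost_per_maint"] = combined["total_cost"] / safe_count
print(combined)
```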

In [390]:
# 1. Aggregate production_df by equipment:
#    - Number of production runs, total output
prod_summary = production_df.groupby('equipment_id').agg({'production_id':'count',
                                                          'actual_quantity':'sum'})
prod_summary.columns = ['생산 건수', '총 생산량']
prod_summary

equipment_id,생산 건수,총 생산량
ASM-001,234,22485
INJ-001,262,28163
INJ-002,430,51958
PRESS-001,468,52069
PRESS-002,478,51929


In [391]:
# 2. Aggregate maintenance_df by equipment:
#    - Number of maintenance events, total maintenance cost, total downtime
main_summary = maintenance_df.groupby('equipment_id').agg({'maintenance_type':'count',
                                                           'cost':'sum',
                                                           'downtime_hours':'sum'}).round(2)
main_summary.columns = ['정비 건수', '총 정비 비용', '총 정지 시간']
main_summary

equipment_id,정비 건수,총 정비 비용,총 정지 시간
ASM-001,19,60072924.01,30.79
INJ-001,17,45291758.85,27.85
INJ-002,23,75076538.5,40.36
PRESS-001,18,44564270.41,29.69
PRESS-002,21,70921381.6,37.71


In [392]:
# 3. Join both aggregates onto equipment_df in sequence (left join)
eq_all = pd.merge(equipment_df, prod_summary, on='equipment_id', how='left')
eq_all = pd.merge(eq_all, main_summary, on='equipment_id', how='left')
eq_all

Unnamed: 0,equipment_id,equipment_name,equipment_type,location,rated_capacity,installation_date,status,created_at,updated_at,생산 건수,총 생산량,정비 건수,총 정비 비용,총 정지 시간
0,INJ-001,사출기 1호기,사출기,A동 1라인,150.0,2020-03-15,ACTIVE,2024-01-01 00:00:00,2024-01-01 00:00:00,262,28163,17,45291758.85,27.85
1,INJ-002,사출기 2호기,사출기,A동 1라인,150.0,2021-06-20,ACTIVE,2024-01-01 00:00:00,2024-01-01 00:00:00,430,51958,23,75076538.5,40.36
2,PRESS-001,프레스 1호기,프레스,A동 2라인,200.0,2019-05-10,ACTIVE,2024-01-01 00:00:00,2024-01-01 00:00:00,468,52069,18,44564270.41,29.69
3,PRESS-002,프레스 2호기,프레스,A동 2라인,200.0,2022-08-25,ACTIVE,2024-01-01 00:00:00,2024-01-01 00:00:00,478,51929,21,70921381.6,37.71
4,ASM-001,조립라인 1호기,조립라인,B동 1라인,100.0,2020-11-30,ACTIVE,2024-01-01 00:00:00,2024-01-01 00:00:00,234,22485,19,60072924.01,30.79


In [393]:
# 4. Fill missing values with 0
eq_all.fillna(0)

Unnamed: 0,equipment_id,equipment_name,equipment_type,location,rated_capacity,installation_date,status,created_at,updated_at,생산 건수,총 생산량,정비 건수,총 정비 비용,총 정지 시간
0,INJ-001,사출기 1호기,사출기,A동 1라인,150.0,2020-03-15,ACTIVE,2024-01-01 00:00:00,2024-01-01 00:00:00,262,28163,17,45291758.85,27.85
1,INJ-002,사출기 2호기,사출기,A동 1라인,150.0,2021-06-20,ACTIVE,2024-01-01 00:00:00,2024-01-01 00:00:00,430,51958,23,75076538.5,40.36
2,PRESS-001,프레스 1호기,프레스,A동 2라인,200.0,2019-05-10,ACTIVE,2024-01-01 00:00:00,2024-01-01 00:00:00,468,52069,18,44564270.41,29.69
3,PRESS-002,프레스 2호기,프레스,A동 2라인,200.0,2022-08-25,ACTIVE,2024-01-01 00:00:00,2024-01-01 00:00:00,478,51929,21,70921381.6,37.71
4,ASM-001,조립라인 1호기,조립라인,B동 1라인,100.0,2020-11-30,ACTIVE,2024-01-01 00:00:00,2024-01-01 00:00:00,234,22485,19,60072924.01,30.79


In [394]:
# 5. Average maintenance cost for each machine (mean cost per maintenance event)
maintenance_df.groupby('equipment_id').agg({'cost':'mean'}).round(2)

equipment_id,cost
ASM-001,3161732.84
INJ-001,2664221.11
INJ-002,3264197.33
PRESS-001,2475792.8
PRESS-002,3377208.65


---
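## Problem 6: Compare Output Against Equipment Capacity

**Scenario**: Compare actual output against each machine's rated capacity to analyze utilization.

**Requirements**:
1. Join production_df and equipment_df (add equipment information)
2. For each production run, compute output as a percentage of capacity:
   - capacity_utilization = (actual_quantity / rated_capacity) * 100
3. Compute the mean capacity_utilization per machine
4. Print the top 5 machines by utilization

**Hint**: merge, create a new column, groupby, sort_values()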
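
A compact sketch of the same steps on invented data is shown below. `Series.nlargest()` is used here as one alternative to `sort_values(...).head(...)`, and the frames and values are assumptions made up for the example.

```python
import pandas as pd

# Toy stand-ins for production_df and equipment_df (values are made up).
production = pd.DataFrame({
    "equipment_id": ["E1", "E1", "E2"],
    "actual_quantity": [75, 150, 100],
})
equipment = pd.DataFrame({
    "equipment_id": ["E1", "E2"],
    "rated_capacity": [150.0, 200.0],
})

merged = production.merge(equipment, on="equipment_id", how="left")
merged["capacity_utilization"] = (
    merged["actual_quantity"] / merged["rated_capacity"] * 100
)

# Mean utilization per machine, then the highest-ranked machines.
mean_util = merged.groupby("equipment_id")["capacity_utilization"].mean()
top = mean_util.nlargest(5)  # returns fewer rows if fewer machines exist
print(top)
```

Utilization computed this way can exceed 100% whenever a run produces more than the rated capacity figure.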

In [395]:
# 1. Join production_df and equipment_df (add equipment information)
prod_eq2 = pd.merge(production_df, equipment_df, on='equipment_id', how='left')

In [396]:
# 2. For each production run, compute output as a percentage of capacity:
#    - capacity_utilization = (actual_quantity / rated_capacity) * 100
prod_eq2['capacity_utilization'] = ((prod_eq2['actual_quantity']/prod_eq2['rated_capacity'])*100).round(2)
prod_eq2

Unnamed: 0,production_id,equipment_id,product_code,production_date,start_time,end_time,target_quantity,actual_quantity,good_quantity,defect_quantity,...,updated_at_x,equipment_name,equipment_type,location,rated_capacity,installation_date,status,created_at_y,updated_at_y,capacity_utilization
0,1,INJ-001,BUMPER-A,2024-01-01,2024-01-01 08:14:00,2024-01-01 09:53:32,97,81,77,4,...,2026-01-30 00:42:48,사출기 1호기,사출기,A동 1라인,150.0,2020-03-15,ACTIVE,2024-01-01 00:00:00,2024-01-01 00:00:00,54.00
1,2,INJ-001,BUMPER-A,2024-01-01,2024-01-01 21:02:00,2024-01-01 22:33:43,83,78,72,6,...,2026-01-30 00:42:48,사출기 1호기,사출기,A동 1라인,150.0,2020-03-15,ACTIVE,2024-01-01 00:00:00,2024-01-01 00:00:00,52.00
2,3,INJ-002,BUMPER-A,2024-01-01,2024-01-01 10:12:00,2024-01-01 13:16:28,149,135,132,3,...,2026-01-30 00:42:48,사출기 2호기,사출기,A동 1라인,150.0,2021-06-20,ACTIVE,2024-01-01 00:00:00,2024-01-01 00:00:00,90.00
3,4,INJ-002,DASH-C,2024-01-01,2024-01-01 12:48:00,2024-01-01 15:16:31,100,92,90,2,...,2026-01-30 00:42:48,사출기 2호기,사출기,A동 1라인,150.0,2021-06-20,ACTIVE,2024-01-01 00:00:00,2024-01-01 00:00:00,61.33
4,5,INJ-002,DOOR-B,2024-01-01,2024-01-01 20:48:00,2024-01-01 23:12:13,123,129,122,7,...,2026-01-30 00:42:48,사출기 2호기,사출기,A동 1라인,150.0,2021-06-20,ACTIVE,2024-01-01 00:00:00,2024-01-01 00:00:00,86.00
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1867,1868,PRESS-002,BUMPER-A,2024-03-31,2024-03-31 20:19:00,2024-03-31 23:25:19,150,144,119,25,...,2026-01-30 00:42:48,프레스 2호기,프레스,A동 2라인,200.0,2022-08-25,ACTIVE,2024-01-01 00:00:00,2024-01-01 00:00:00,72.00
1868,1869,PRESS-002,DASH-C,2024-03-31,2024-04-01 00:15:00,2024-04-01 02:59:58,136,130,109,21,...,2026-01-30 00:42:48,프레스 2호기,프레스,A동 2라인,200.0,2022-08-25,ACTIVE,2024-01-01 00:00:00,2024-01-01 00:00:00,65.00
1869,1870,PRESS-002,BUMPER-A,2024-03-31,2024-04-01 05:53:00,2024-04-01 07:26:15,84,80,66,14,...,2026-01-30 00:42:48,프레스 2호기,프레스,A동 2라인,200.0,2022-08-25,ACTIVE,2024-01-01 00:00:00,2024-01-01 00:00:00,40.00
1870,1871,ASM-001,BUMPER-A,2024-03-31,2024-03-31 10:24:00,2024-03-31 13:25:41,143,121,101,20,...,2026-01-30 00:42:48,조립라인 1호기,조립라인,B동 1라인,100.0,2020-11-30,ACTIVE,2024-01-01 00:00:00,2024-01-01 00:00:00,121.00


In [397]:
# 3. Compute the mean capacity_utilization per machine
prod_eq2.groupby('equipment_id').agg({'capacity_utilization':'mean'}).round(2)

equipment_id,capacity_utilization
ASM-001,96.09
INJ-001,71.66
INJ-002,80.56
PRESS-001,55.63
PRESS-002,54.32


In [398]:
# 4. Print the top 5 machines by utilization
prod_eq2.groupby('equipment_id').agg({'capacity_utilization':'mean'}).round(2).sort_values('capacity_utilization', ascending=False).head(5)

equipment_id,capacity_utilization
ASM-001,96.09
INJ-002,80.56
INJ-001,71.66
PRESS-001,55.63
PRESS-002,54.32


---
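## Problem 7: Equipment Performance by Manufacturer

**Scenario**: Compare production performance across equipment manufacturers.

**Requirements**:
1. Join equipment_df onto production_df (add manufacturer information)
2. Aggregate by manufacturer:
   - Number of machines (unique count of equipment_id)
   - Total output
   - Mean defect rate
   - Mean cycle time
3. Sort by total output in descending order

**Hint**: merge, groupby with nunique, agg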

In [399]:
# No manufacturer data: equipment_df has no manufacturer column
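
Because the loaded equipment table has no manufacturer column, the requirements above cannot be run against the real data. The following is only a minimal sketch on made-up data: the `manufacturer` column, the `*_toy` frames, and every value in them are hypothetical, and `defect_rate` is assumed to mean defect_quantity / actual_quantity * 100.

```python
import pandas as pd

# Toy data: the `manufacturer` column and all values are hypothetical.
equipment_toy = pd.DataFrame({
    "equipment_id": ["INJ-001", "INJ-002", "PRESS-001"],
    "manufacturer": ["MakerA", "MakerA", "MakerB"],
})
production_toy = pd.DataFrame({
    "equipment_id": ["INJ-001", "INJ-002", "INJ-002", "PRESS-001"],
    "actual_quantity": [100, 200, 100, 50],
    "defect_quantity": [10, 10, 5, 5],
    "cycle_time": [70.0, 80.0, 90.0, 60.0],
})

# Many production rows map to one equipment row, hence validate="m:1".
merged = production_toy.merge(
    equipment_toy, on="equipment_id", how="left", validate="m:1"
)
merged["defect_rate"] = merged["defect_quantity"] / merged["actual_quantity"] * 100

# Named aggregation gives readable output column names in one step.
by_maker = (
    merged.groupby("manufacturer")
    .agg(
        equipment_count=("equipment_id", "nunique"),
        total_quantity=("actual_quantity", "sum"),
        mean_defect_rate=("defect_rate", "mean"),
        mean_cycle_time=("cycle_time", "mean"),
    )
    .sort_values("total_quantity", ascending=False)
)
print(by_maker)
```

If a real manufacturer column became available, the same pattern should apply with `production_df` and `equipment_df` in place of the toy frames.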

---
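## Problem 8: Split the data by month, then recombine

**Scenario**: Split the production data by month, process each part, and then put the parts back together.

**Requirements**:
1. Split production_df into January, February, and March data
2. Add a '분기' (quarter) column to each month's data (all 'Q1')
3. Combine the three DataFrames vertically with concat
4. Reset the index after combining (ignore_index=True)
5. Check the production count per quarter

**Hint**: filter by month, concat([df1, df2, df3], axis=0)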

In [400]:
# 1. Split production_df into January, February, and March data
production_01 = production_df[production_df['production_date'].dt.month == 1].copy()
production_02 = production_df[production_df['production_date'].dt.month == 2].copy()
production_03 = production_df[production_df['production_date'].dt.month == 3].copy()

In [401]:
# 2. Add a '분기' (quarter) column to each month's data (all 'Q1')
production_01['분기'] = 'Q1'
production_02['분기'] = 'Q1'
production_03['분기'] = 'Q1'
production_03

Unnamed: 0,production_id,equipment_id,product_code,production_date,start_time,end_time,target_quantity,actual_quantity,good_quantity,defect_quantity,cycle_time,work_order_no,lot_no,operator_id,shift,created_at,updated_at,분기
1228,1229,INJ-001,BUMPER-A,2024-03-01,2024-03-01 09:31:00,2024-03-01 12:13:16,135,128,110,18,76.07,WO202403011606,LOT2024030100101,OP003,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48,Q1
1229,1230,INJ-001,BUMPER-A,2024-03-01,2024-03-01 21:08:00,2024-03-01 22:54:59,83,78,64,14,82.30,WO202403015551,LOT2024030100110,OP004,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,Q1
1230,1231,INJ-002,DASH-C,2024-03-01,2024-03-01 10:31:00,2024-03-01 12:34:55,82,86,74,12,86.46,WO202403012375,LOT2024030100201,OP001,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48,Q1
1231,1232,INJ-002,DOOR-B,2024-03-01,2024-03-01 12:57:00,2024-03-01 15:18:13,132,138,123,15,61.40,WO202403012479,LOT2024030100202,OP003,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48,Q1
1232,1233,INJ-002,DOOR-B,2024-03-01,2024-03-01 20:15:00,2024-03-01 22:05:45,94,98,84,14,67.81,WO202403019196,LOT2024030100210,OP006,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,Q1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1867,1868,PRESS-002,BUMPER-A,2024-03-31,2024-03-31 20:19:00,2024-03-31 23:25:19,150,144,119,25,77.63,WO202403317101,LOT2024033100210,OP006,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,Q1
1868,1869,PRESS-002,DASH-C,2024-03-31,2024-04-01 00:15:00,2024-04-01 02:59:58,136,130,109,21,76.15,WO202403318434,LOT2024033100211,OP004,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,Q1
1869,1870,PRESS-002,BUMPER-A,2024-03-31,2024-04-01 05:53:00,2024-04-01 07:26:15,84,80,66,14,69.95,WO202403317294,LOT2024033100212,OP004,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,Q1
1870,1871,ASM-001,BUMPER-A,2024-03-31,2024-03-31 10:24:00,2024-03-31 13:25:41,143,121,101,20,90.10,WO202403317268,LOT2024033100101,OP003,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48,Q1


In [402]:
# 3. Combine the three DataFrames vertically with concat
pd.concat([production_01, production_02, production_03])

Unnamed: 0,production_id,equipment_id,product_code,production_date,start_time,end_time,target_quantity,actual_quantity,good_quantity,defect_quantity,cycle_time,work_order_no,lot_no,operator_id,shift,created_at,updated_at,분기
0,1,INJ-001,BUMPER-A,2024-01-01,2024-01-01 08:14:00,2024-01-01 09:53:32,97,81,77,4,73.73,WO202401019935,LOT2024010100101,OP003,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48,Q1
1,2,INJ-001,BUMPER-A,2024-01-01,2024-01-01 21:02:00,2024-01-01 22:33:43,83,78,72,6,70.56,WO202401012535,LOT2024010100110,OP006,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,Q1
2,3,INJ-002,BUMPER-A,2024-01-01,2024-01-01 10:12:00,2024-01-01 13:16:28,149,135,132,3,81.99,WO202401018359,LOT2024010100201,OP001,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48,Q1
3,4,INJ-002,DASH-C,2024-01-01,2024-01-01 12:48:00,2024-01-01 15:16:31,100,92,90,2,96.87,WO202401016574,LOT2024010100202,OP003,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48,Q1
4,5,INJ-002,DOOR-B,2024-01-01,2024-01-01 20:48:00,2024-01-01 23:12:13,123,129,122,7,67.08,WO202401012674,LOT2024010100210,OP004,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,Q1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1867,1868,PRESS-002,BUMPER-A,2024-03-31,2024-03-31 20:19:00,2024-03-31 23:25:19,150,144,119,25,77.63,WO202403317101,LOT2024033100210,OP006,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,Q1
1868,1869,PRESS-002,DASH-C,2024-03-31,2024-04-01 00:15:00,2024-04-01 02:59:58,136,130,109,21,76.15,WO202403318434,LOT2024033100211,OP004,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,Q1
1869,1870,PRESS-002,BUMPER-A,2024-03-31,2024-04-01 05:53:00,2024-04-01 07:26:15,84,80,66,14,69.95,WO202403317294,LOT2024033100212,OP004,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,Q1
1870,1871,ASM-001,BUMPER-A,2024-03-31,2024-03-31 10:24:00,2024-03-31 13:25:41,143,121,101,20,90.10,WO202403317268,LOT2024033100101,OP003,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48,Q1


In [403]:
# 4. Reset the index after combining (ignore_index=True)
pd.concat([production_01, production_02, production_03], ignore_index=True)

Unnamed: 0,production_id,equipment_id,product_code,production_date,start_time,end_time,target_quantity,actual_quantity,good_quantity,defect_quantity,cycle_time,work_order_no,lot_no,operator_id,shift,created_at,updated_at,분기
0,1,INJ-001,BUMPER-A,2024-01-01,2024-01-01 08:14:00,2024-01-01 09:53:32,97,81,77,4,73.73,WO202401019935,LOT2024010100101,OP003,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48,Q1
1,2,INJ-001,BUMPER-A,2024-01-01,2024-01-01 21:02:00,2024-01-01 22:33:43,83,78,72,6,70.56,WO202401012535,LOT2024010100110,OP006,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,Q1
2,3,INJ-002,BUMPER-A,2024-01-01,2024-01-01 10:12:00,2024-01-01 13:16:28,149,135,132,3,81.99,WO202401018359,LOT2024010100201,OP001,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48,Q1
3,4,INJ-002,DASH-C,2024-01-01,2024-01-01 12:48:00,2024-01-01 15:16:31,100,92,90,2,96.87,WO202401016574,LOT2024010100202,OP003,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48,Q1
4,5,INJ-002,DOOR-B,2024-01-01,2024-01-01 20:48:00,2024-01-01 23:12:13,123,129,122,7,67.08,WO202401012674,LOT2024010100210,OP004,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,Q1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1867,1868,PRESS-002,BUMPER-A,2024-03-31,2024-03-31 20:19:00,2024-03-31 23:25:19,150,144,119,25,77.63,WO202403317101,LOT2024033100210,OP006,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,Q1
1868,1869,PRESS-002,DASH-C,2024-03-31,2024-04-01 00:15:00,2024-04-01 02:59:58,136,130,109,21,76.15,WO202403318434,LOT2024033100211,OP004,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,Q1
1869,1870,PRESS-002,BUMPER-A,2024-03-31,2024-04-01 05:53:00,2024-04-01 07:26:15,84,80,66,14,69.95,WO202403317294,LOT2024033100212,OP004,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,Q1
1870,1871,ASM-001,BUMPER-A,2024-03-31,2024-03-31 10:24:00,2024-03-31 13:25:41,143,121,101,20,90.10,WO202403317268,LOT2024033100101,OP003,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48,Q1


In [404]:
# 5. Check the production count per quarter
pd.concat([production_01, production_02, production_03], ignore_index=True)['분기'].value_counts()

분기
Q1    1872
Name: count, dtype: int64
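
The exercise hard-codes 'Q1' because every row falls in January through March. As a hedged alternative, the quarter label could be derived from the date itself, so the same code would still label rows correctly if later months were appended. The sketch below uses a small made-up frame; `toy`, `parts`, and the `quarter` column name are illustrative, not part of the exercise data.

```python
import pandas as pd

# Toy stand-in for production_df; the values are made up.
toy = pd.DataFrame({
    "production_id": [1, 2, 3, 4],
    "production_date": pd.to_datetime(
        ["2024-01-05", "2024-02-10", "2024-03-15", "2024-03-20"]
    ),
})

# Split by month, as in the exercise.
parts = [toy[toy["production_date"].dt.month == m].copy() for m in (1, 2, 3)]

# Stack the parts vertically and rebuild a clean 0..n-1 index.
combined = pd.concat(parts, axis=0, ignore_index=True)

# Derive the quarter label from the date instead of hard-coding 'Q1'.
combined["quarter"] = "Q" + combined["production_date"].dt.quarter.astype(str)

counts = combined["quarter"].value_counts()
print(counts)
```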

---
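## Problem 9: Three-way merge of production, quality, and equipment

**Scenario**: Merge the production, quality, and equipment data into a single table for combined analysis.

**Requirements**:
1. Aggregate the quality data per production_id:
   - Inspection count, defect count, mean measurement value
2. Merge the quality aggregate into the production data (left join)
3. Merge the equipment information into the result (add equipment_name, equipment_type)
4. Compute the mean defect rate per equipment type
5. Print the result grouped by equipment type

**Hint**: merge sequentially, checking the intermediate result at each step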
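
One way to check intermediate results while merging sequentially is to let `merge` do the checking through its `validate` and `indicator` parameters. The sketch below runs on made-up frames (`production_toy`, `quality_toy`, `equipment_toy` are hypothetical) and is not the exercise solution, only an illustration of the checking pattern.

```python
import pandas as pd

# Toy frames; all values are made up for illustration.
production_toy = pd.DataFrame(
    {"production_id": [1, 2, 3], "equipment_id": ["E1", "E1", "E2"]}
)
quality_toy = pd.DataFrame(
    {"production_id": [1, 1, 2], "result": ["PASS", "FAIL", "PASS"]}
)
equipment_toy = pd.DataFrame(
    {"equipment_id": ["E1", "E2"], "equipment_type": ["press", "injection"]}
)

# Step 1: collapse quality rows to one row per production_id.
quality_agg = (
    quality_toy.groupby("production_id")
    .agg(
        inspections=("result", "size"),
        defects=("result", lambda s: (s == "FAIL").sum()),
    )
    .reset_index()
)

# Step 2: left join. validate="1:1" raises MergeError if either side has
# duplicate keys; indicator=True records where each row came from.
step2 = production_toy.merge(
    quality_agg, on="production_id", how="left", validate="1:1", indicator=True
)
unmatched = (step2["_merge"] == "left_only").sum()  # runs with no inspections
step2 = step2.drop(columns="_merge")

# Step 3: many production rows per equipment row, hence validate="m:1".
step3 = step2.merge(equipment_toy, on="equipment_id", how="left", validate="m:1")

# A left join against unique keys must not change the row count.
assert len(step3) == len(production_toy)
print(step3, f"unmatched production rows: {unmatched}", sep="\n")
```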

In [405]:
# 1. Aggregate the quality data per production_id:
#    - inspection count (검사 건수), defect count (불량 건수), mean measurement value (평균 측정값)
def defect_cnt(x):
    return (x == 'FAIL').sum()

qual_summary = quality_df.groupby('production_id').agg({'production_id':'count',
                                                        'result':defect_cnt,
                                                        'measurement_value':'mean'}).round(2)
qual_summary.columns = ['검사 건수', '불량 건수', '평균 측정값']
qual_summary

Unnamed: 0_level_0,검사 건수,불량 건수,평균 측정값
production_id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
1,11,4,300.58
2,13,6,299.20
3,13,3,298.30
4,11,2,400.70
5,17,7,598.72
...,...,...,...
1868,35,25,298.00
1869,31,21,400.62
1870,20,14,301.14
1871,30,20,300.78


In [406]:
# 2. Merge the quality aggregate into the production data (left join)
prod_qual_eq = pd.merge(production_df, qual_summary, on='production_id', how='left')
prod_qual_eq

Unnamed: 0,production_id,equipment_id,product_code,production_date,start_time,end_time,target_quantity,actual_quantity,good_quantity,defect_quantity,cycle_time,work_order_no,lot_no,operator_id,shift,created_at,updated_at,검사 건수,불량 건수,평균 측정값
0,1,INJ-001,BUMPER-A,2024-01-01,2024-01-01 08:14:00,2024-01-01 09:53:32,97,81,77,4,73.73,WO202401019935,LOT2024010100101,OP003,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48,11,4,300.58
1,2,INJ-001,BUMPER-A,2024-01-01,2024-01-01 21:02:00,2024-01-01 22:33:43,83,78,72,6,70.56,WO202401012535,LOT2024010100110,OP006,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,13,6,299.20
2,3,INJ-002,BUMPER-A,2024-01-01,2024-01-01 10:12:00,2024-01-01 13:16:28,149,135,132,3,81.99,WO202401018359,LOT2024010100201,OP001,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48,13,3,298.30
3,4,INJ-002,DASH-C,2024-01-01,2024-01-01 12:48:00,2024-01-01 15:16:31,100,92,90,2,96.87,WO202401016574,LOT2024010100202,OP003,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48,11,2,400.70
4,5,INJ-002,DOOR-B,2024-01-01,2024-01-01 20:48:00,2024-01-01 23:12:13,123,129,122,7,67.08,WO202401012674,LOT2024010100210,OP004,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,17,7,598.72
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1867,1868,PRESS-002,BUMPER-A,2024-03-31,2024-03-31 20:19:00,2024-03-31 23:25:19,150,144,119,25,77.63,WO202403317101,LOT2024033100210,OP006,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,35,25,298.00
1868,1869,PRESS-002,DASH-C,2024-03-31,2024-04-01 00:15:00,2024-04-01 02:59:58,136,130,109,21,76.15,WO202403318434,LOT2024033100211,OP004,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,31,21,400.62
1869,1870,PRESS-002,BUMPER-A,2024-03-31,2024-04-01 05:53:00,2024-04-01 07:26:15,84,80,66,14,69.95,WO202403317294,LOT2024033100212,OP004,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,20,14,301.14
1870,1871,ASM-001,BUMPER-A,2024-03-31,2024-03-31 10:24:00,2024-03-31 13:25:41,143,121,101,20,90.10,WO202403317268,LOT2024033100101,OP003,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48,30,20,300.78


In [407]:
# 3. Merge the equipment information into the result (add equipment_name, equipment_type)
prod_qual_eq = pd.merge(prod_qual_eq, equipment_df.loc[:, 'equipment_id':'equipment_type'], on='equipment_id', how='left')
prod_qual_eq

Unnamed: 0,production_id,equipment_id,product_code,production_date,start_time,end_time,target_quantity,actual_quantity,good_quantity,defect_quantity,...,lot_no,operator_id,shift,created_at,updated_at,검사 건수,불량 건수,평균 측정값,equipment_name,equipment_type
0,1,INJ-001,BUMPER-A,2024-01-01,2024-01-01 08:14:00,2024-01-01 09:53:32,97,81,77,4,...,LOT2024010100101,OP003,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48,11,4,300.58,사출기 1호기,사출기
1,2,INJ-001,BUMPER-A,2024-01-01,2024-01-01 21:02:00,2024-01-01 22:33:43,83,78,72,6,...,LOT2024010100110,OP006,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,13,6,299.20,사출기 1호기,사출기
2,3,INJ-002,BUMPER-A,2024-01-01,2024-01-01 10:12:00,2024-01-01 13:16:28,149,135,132,3,...,LOT2024010100201,OP001,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48,13,3,298.30,사출기 2호기,사출기
3,4,INJ-002,DASH-C,2024-01-01,2024-01-01 12:48:00,2024-01-01 15:16:31,100,92,90,2,...,LOT2024010100202,OP003,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48,11,2,400.70,사출기 2호기,사출기
4,5,INJ-002,DOOR-B,2024-01-01,2024-01-01 20:48:00,2024-01-01 23:12:13,123,129,122,7,...,LOT2024010100210,OP004,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,17,7,598.72,사출기 2호기,사출기
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1867,1868,PRESS-002,BUMPER-A,2024-03-31,2024-03-31 20:19:00,2024-03-31 23:25:19,150,144,119,25,...,LOT2024033100210,OP006,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,35,25,298.00,프레스 2호기,프레스
1868,1869,PRESS-002,DASH-C,2024-03-31,2024-04-01 00:15:00,2024-04-01 02:59:58,136,130,109,21,...,LOT2024033100211,OP004,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,31,21,400.62,프레스 2호기,프레스
1869,1870,PRESS-002,BUMPER-A,2024-03-31,2024-04-01 05:53:00,2024-04-01 07:26:15,84,80,66,14,...,LOT2024033100212,OP004,NIGHT,2026-01-30 00:42:48,2026-01-30 00:42:48,20,14,301.14,프레스 2호기,프레스
1870,1871,ASM-001,BUMPER-A,2024-03-31,2024-03-31 10:24:00,2024-03-31 13:25:41,143,121,101,20,...,LOT2024033100101,OP003,DAY,2026-01-30 00:42:48,2026-01-30 00:42:48,30,20,300.78,조립라인 1호기,조립라인


In [408]:
# 4. Compute the mean defect rate (불량률) per equipment type
prod_qual_eq['불량률'] = (prod_qual_eq['불량 건수']/prod_qual_eq['검사 건수']*100).round(2)
prod_qual_eq.groupby('equipment_type').agg({'불량률':'mean'}).round(2)

Unnamed: 0_level_0,불량률
equipment_type,Unnamed: 1_level_1
사출기,49.83
조립라인,56.75
프레스,52.3


In [409]:
# 5. Print the result grouped by equipment type
prod_qual_eq.groupby('equipment_type').agg({'불량률':'mean'}).round(2)

Unnamed: 0_level_0,불량률
equipment_type,Unnamed: 1_level_1
사출기,49.83
조립라인,56.75
프레스,52.3


---
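## Problem 10: Build data for a comprehensive equipment dashboard

**Scenario**: Combine all the data to produce the dataset behind a per-equipment performance dashboard.

**Requirements**:
1. Production aggregates per equipment:
   - Production run count, total production quantity, mean defect rate
2. Maintenance aggregates per equipment:
   - Maintenance count, total maintenance cost, mean downtime
3. Operation aggregates per equipment:
   - Total RUNNING time (rows where operation_status='RUNNING')
4. Merge all three aggregates onto equipment_df as the base (left join)
5. Fill missing values with 0
6. Compute the following metrics:
   - Utilization = (RUNNING time / total time) * 100
   - Productivity = total production quantity / production run count
   - Maintenance efficiency = total production quantity / maintenance count
7. Overall score = productivity * (100 - mean defect rate) / maintenance count
8. Print the top 5 equipment by overall score

**Hint**: This is a complex problem. Work step by step: reset_index() after each aggregation, merge sequentially, then create the computed columns.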
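
The sequential left joins in step 4 can be sketched with `functools.reduce`, starting from the equipment table so that equipment with no production or maintenance rows still appears. The frames below are made up (`equipment_toy`, `prod_agg`, `maint_agg` and their column names are hypothetical). Note one assumption made here: after filling missing counts with 0, dividing by a zero count would yield `inf`, so the sketch masks zero denominators and leaves those ratios as NaN.

```python
from functools import reduce

import pandas as pd

# Toy frames; E3 has no production or maintenance history.
equipment_toy = pd.DataFrame({"equipment_id": ["E1", "E2", "E3"]})
prod_agg = pd.DataFrame(
    {"equipment_id": ["E1", "E2"], "total_quantity": [300, 150], "run_count": [3, 2]}
)
maint_agg = pd.DataFrame({"equipment_id": ["E1"], "maintenance_count": [2]})

# Left-join each aggregate onto the equipment base, one after another.
dashboard = reduce(
    lambda left, right: left.merge(right, on="equipment_id", how="left"),
    [prod_agg, maint_agg],
    equipment_toy,
)

# Missing aggregates mean "nothing recorded", so fill them with 0.
dashboard = dashboard.fillna(0)

# Mask zero denominators so the ratios become NaN instead of inf.
runs = dashboard["run_count"].where(dashboard["run_count"] > 0)
maint = dashboard["maintenance_count"].where(dashboard["maintenance_count"] > 0)
dashboard["productivity"] = dashboard["total_quantity"] / runs
dashboard["maintenance_efficiency"] = dashboard["total_quantity"] / maint
print(dashboard)
```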

In [410]:
# 1. Production aggregates per equipment:
#    - production run count (생산 건수), total production quantity (총 생산량), mean defect rate (평균 불량률)
production_df['불량률'] = (production_df['defect_quantity']/production_df['actual_quantity']*100).round(2)
prod_summary = production_df.groupby('equipment_id').agg({'production_id':'count',
                                                          'actual_quantity':'sum',
                                                          '불량률':'mean'}).round(2)
prod_summary.columns = ['생산 건수', '총 생산량', '평균 불량률']
prod_summary

Unnamed: 0_level_0,생산 건수,총 생산량,평균 불량률
equipment_id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
ASM-001,234,22485,12.12
INJ-001,262,28163,10.76
INJ-002,430,51958,8.7
PRESS-001,468,52069,9.91
PRESS-002,478,51929,10.68


In [411]:
# 2. Maintenance aggregates per equipment:
#    - maintenance count (정비 건수), total maintenance cost (총 정비 비용), mean downtime (평균 정지 시간)
main_summary = maintenance_df.groupby('equipment_id').agg({'maintenance_type':'count',
                                                           'cost':'sum',
                                                           'downtime_hours':'mean'}).round(2)
main_summary.columns = ['정비 건수', '총 정비 비용', '평균 정지 시간']
main_summary

Unnamed: 0_level_0,정비 건수,총 정비 비용,평균 정지 시간
equipment_id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
ASM-001,19,60072924.01,1.62
INJ-001,17,45291758.85,1.64
INJ-002,23,75076538.5,1.75
PRESS-001,18,44564270.41,1.65
PRESS-002,21,70921381.6,1.8


In [412]:
# 3. Operation aggregates per equipment:
#    - total RUNNING time (rows where operation_status='RUNNING')
operation_df['start_time'] = pd.to_datetime(operation_df['start_time'])
operation_df['end_time'] = pd.to_datetime(operation_df['end_time'])
operation_df['running_time'] = (operation_df['end_time'] - operation_df['start_time'])
op_summary = operation_df[operation_df['operation_status'] == 'RUNNING'].groupby('equipment_id').agg({'running_time':'sum'})
op_summary

Unnamed: 0_level_0,running_time
equipment_id,Unnamed: 1_level_1
ASM-001,43 days 14:35:26
INJ-001,46 days 06:35:40
INJ-002,46 days 07:18:55
PRESS-001,46 days 03:24:28
PRESS-002,45 days 07:12:06


In [413]:
# 4. Merge all three aggregates on equipment_id (left join)
# prod_summary, main_summary, op_summary
# Note: prod_summary serves as the base table here rather than equipment_df
op_analysis = pd.merge(prod_summary, main_summary, on='equipment_id', how='left')
op_analysis = pd.merge(op_analysis, op_summary, on='equipment_id', how='left')
op_analysis

Unnamed: 0_level_0,생산 건수,총 생산량,평균 불량률,정비 건수,총 정비 비용,평균 정지 시간,running_time
equipment_id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ASM-001,234,22485,12.12,19,60072924.01,1.62,43 days 14:35:26
INJ-001,262,28163,10.76,17,45291758.85,1.64,46 days 06:35:40
INJ-002,430,51958,8.7,23,75076538.5,1.75,46 days 07:18:55
PRESS-001,468,52069,9.91,18,44564270.41,1.65,46 days 03:24:28
PRESS-002,478,51929,10.68,21,70921381.6,1.8,45 days 07:12:06


In [414]:
# 5. Fill missing values with 0 (assign the result back so the fill persists)
op_analysis = op_analysis.fillna(0)
op_analysis

Unnamed: 0_level_0,생산 건수,총 생산량,평균 불량률,정비 건수,총 정비 비용,평균 정지 시간,running_time
equipment_id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ASM-001,234,22485,12.12,19,60072924.01,1.62,43 days 14:35:26
INJ-001,262,28163,10.76,17,45291758.85,1.64,46 days 06:35:40
INJ-002,430,51958,8.7,23,75076538.5,1.75,46 days 07:18:55
PRESS-001,468,52069,9.91,18,44564270.41,1.65,46 days 03:24:28
PRESS-002,478,51929,10.68,21,70921381.6,1.8,45 days 07:12:06


In [415]:
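# 6. Compute the following metrics:
#    - 가동률 (utilization) = (RUNNING time / total time) * 100
#    - 생산성 (productivity) = total production quantity / production run count
#    - 정비 효율 (maintenance efficiency) = total production quantity / maintenance count

production_df['start_time'] = pd.to_datetime(production_df['start_time'])
production_df['end_time'] = pd.to_datetime(production_df['end_time'])
# Total time: from the first row's start_time to the last row's end_time (relies on row order)
total_t = (production_df.loc[len(production_df)-1, 'end_time'] - production_df.loc[0, 'start_time']).total_seconds()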

op_analysis['가동률'] = (op_analysis['running_time'].dt.total_seconds()/total_t*100).round(2)
op_analysis['생산성'] = (op_analysis['총 생산량']/op_analysis['생산 건수']).round(2)
op_analysis['정비 효율'] = (op_analysis['총 생산량']/op_analysis['정비 건수']).round(2)
op_analysis

Unnamed: 0_level_0,생산 건수,총 생산량,평균 불량률,정비 건수,총 정비 비용,평균 정지 시간,running_time,가동률,생산성,정비 효율
equipment_id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1
ASM-001,234,22485,12.12,19,60072924.01,1.62,43 days 14:35:26,48.14,96.09,1183.42
INJ-001,262,28163,10.76,17,45291758.85,1.64,46 days 06:35:40,51.08,107.49,1656.65
INJ-002,430,51958,8.7,23,75076538.5,1.75,46 days 07:18:55,51.12,120.83,2259.04
PRESS-001,468,52069,9.91,18,44564270.41,1.65,46 days 03:24:28,50.94,111.26,2892.72
PRESS-002,478,51929,10.68,21,70921381.6,1.8,45 days 07:12:06,50.01,108.64,2472.81


In [416]:
# 7. 종합 점수 (overall score) = productivity * (100 - mean defect rate) / maintenance count
op_analysis['종합 점수'] = (op_analysis['생산성']*(100 - op_analysis['평균 불량률'])/op_analysis['정비 건수']).round(2)
op_analysis

Unnamed: 0_level_0,생산 건수,총 생산량,평균 불량률,정비 건수,총 정비 비용,평균 정지 시간,running_time,가동률,생산성,정비 효율,종합 점수
equipment_id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1
ASM-001,234,22485,12.12,19,60072924.01,1.62,43 days 14:35:26,48.14,96.09,1183.42,444.44
INJ-001,262,28163,10.76,17,45291758.85,1.64,46 days 06:35:40,51.08,107.49,1656.65,564.26
INJ-002,430,51958,8.7,23,75076538.5,1.75,46 days 07:18:55,51.12,120.83,2259.04,479.64
PRESS-001,468,52069,9.91,18,44564270.41,1.65,46 days 03:24:28,50.94,111.26,2892.72,556.86
PRESS-002,478,51929,10.68,21,70921381.6,1.8,45 days 07:12:06,50.01,108.64,2472.81,462.08


In [417]:
# 8. Print the top 5 equipment by overall score
op_analysis.sort_values('종합 점수', ascending=False).head(5)

Unnamed: 0_level_0,생산 건수,총 생산량,평균 불량률,정비 건수,총 정비 비용,평균 정지 시간,running_time,가동률,생산성,정비 효율,종합 점수
equipment_id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1
INJ-001,262,28163,10.76,17,45291758.85,1.64,46 days 06:35:40,51.08,107.49,1656.65,564.26
PRESS-001,468,52069,9.91,18,44564270.41,1.65,46 days 03:24:28,50.94,111.26,2892.72,556.86
INJ-002,430,51958,8.7,23,75076538.5,1.75,46 days 07:18:55,51.12,120.83,2259.04,479.64
PRESS-002,478,51929,10.68,21,70921381.6,1.8,45 days 07:12:06,50.01,108.64,2472.81,462.08
ASM-001,234,22485,12.12,19,60072924.01,1.62,43 days 14:35:26,48.14,96.09,1183.42,444.44


---
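## Well done!

### Learning checklist
- [ ] Append DataFrames with concat
- [ ] Understand the different merge join types (inner, left, right, outer)
- [ ] Understand 1:1 and 1:N relationships
- [ ] Aggregate, then merge
- [ ] Merge multiple tables sequentially
- [ ] Handle missing values after a merge
- [ ] Integrate data for combined analysis
- [ ] Apply the techniques to real-world analysis scenarios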
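
As a quick review of the join types in the checklist, the sketch below counts the rows each `how` option returns for two small made-up tables that share only some keys. The frames `left` and `right` are hypothetical.

```python
import pandas as pd

# E1 exists only on the left, E4 only on the right; E2 and E3 are shared.
left = pd.DataFrame({"equipment_id": ["E1", "E2", "E3"], "qty": [10, 20, 30]})
right = pd.DataFrame({
    "equipment_id": ["E2", "E3", "E4"],
    "equipment_type": ["press", "press", "assembly"],
})

row_counts = {
    how: len(left.merge(right, on="equipment_id", how=how))
    for how in ("inner", "left", "right", "outer")
}
print(row_counts)  # {'inner': 2, 'left': 3, 'right': 3, 'outer': 4}
```

An inner join keeps only the shared keys, left and right joins keep every key from one side, and an outer join keeps the union of both.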

