# [ 1-4. Coding Period-Based Revenue Recognition ]

## 1. Reading Data from the Database

### 1-1. Writing the Prompt

![Writing the prompt](image/CH01-04-01.png)

### 1-2. ChatGPT Answer 1
![ChatGPT answer 1](image/CH01-04-02.png)
![ChatGPT answer 1](image/CH01-04-03.png)

#### Applying the ChatGPT Answer to connector3.py
```
# connector3.py
import mysql.connector
import pandas as pd

class Connector:
    def __enter__(self):
        self.connection = mysql.connector.connect(
            host="localhost",
            user="root",
            port="3306",
            password="fastcampus1!",
            database="daily_sales"
        )
        self.cursor = self.connection.cursor(dictionary=True)
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.cursor.close()
        self.connection.close()

    def insert_data(self, query, data):
        self.cursor.executemany(query, data)
        self.connection.commit()

    def fetch_data(self, query):
        self.cursor.execute(query)
        result = self.cursor.fetchall()
        return pd.DataFrame(result)
```
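The connection settings above are hard-coded, including the password. As a minimal sketch, assuming hypothetical environment variable names (`DB_HOST`, `DB_USER`, `DB_PORT`, `DB_PASSWORD`, `DB_NAME`) and a helper `load_db_config` that are not part of the original course code, the settings could be gathered in one place:

```python
import os


def load_db_config(env=None):
    """Build connection keyword arguments from environment variables.

    The variable names below are illustrative assumptions, not part of
    the original connector3.py.
    """
    env = os.environ if env is None else env
    return {
        "host": env.get("DB_HOST", "localhost"),
        "user": env.get("DB_USER", "root"),
        # Store the port as an integer rather than a string.
        "port": int(env.get("DB_PORT", "3306")),
        # No default here: a missing password raises KeyError early.
        "password": env["DB_PASSWORD"],
        "database": env.get("DB_NAME", "daily_sales"),
    }


# Example with an explicit mapping instead of the real environment.
config = load_db_config({"DB_PASSWORD": "example-password"})
print(config["host"], config["port"], config["database"])
```

Inside `__enter__`, the literal arguments could then be replaced by `mysql.connector.connect(**load_db_config())`, which passes the same keyword arguments the class already uses.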

### 1-3. ChatGPT Answer 2
![ChatGPT answer 2](image/CH01-04-04.png)

#### Running the Code

In [7]:
from connector3 import Connector

# Query that fetches all rows from the daily_sales_raw table
daily_sales_raw_query = "SELECT * FROM daily_sales_raw"
# Query that fetches all rows from the original_price table
original_price_query = "SELECT * FROM original_price"

# Load the data from the database into DataFrames
with Connector() as db:
    daily_sales_raw_df = db.fetch_data(daily_sales_raw_query)
    original_price_df = db.fetch_data(original_price_query)

# Print the loaded DataFrames (for verification)
print("daily_sales_raw DataFrame:")
print(daily_sales_raw_df.head())

print("\noriginal_price DataFrame:")
print(original_price_df.head())


daily_sales_raw DataFrame:
   id payment_date  stay_days checkin_date checkout_date  original_price  payment_amount
0   1   2024-01-03          1   2024-01-13    2024-01-14          700000          770000
1   2   2024-01-03          2   2024-01-10    2024-01-12         1000000          980000
2   3   2024-01-03          1   2024-01-03    2024-01-04          500000          540000
3   4   2024-01-03          5   2024-01-07    2024-01-12         2700000         2280000
4   5   2024-01-03          4   2024-01-09    2024-01-13         2200000         2210000

original_price DataFrame:
         date  weekday  original_price
0  2024-01-01        1          500000
1  2024-01-02        2          500000
2  2024-01-03        3          500000
3  2024-01-04        4          500000
4  2024-01-05        5          700000


## 2. Calculating Period-Based Revenue Recognition
### 2-1. Importing Key Libraries and Settings

In [14]:
import pandas as pd

# Configure DataFrame display options
pd.set_option('display.max_rows', 500)
pd.set_option('display.min_rows', 50)
pd.set_option('display.max_columns', 100)
pd.set_option('display.max_colwidth', 100)
pd.set_option('display.width', 300)
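
`pd.set_option` changes the display settings for the rest of the session. Where a wider display is needed for a single print only, `pd.option_context` applies the options temporarily. A minimal sketch with a made-up DataFrame (`demo_df` is not course data):

```python
import pandas as pd

# Illustrative data only; not from the course database.
demo_df = pd.DataFrame({"n": range(100)})

rows_before = pd.get_option("display.max_rows")

# Options set here are restored automatically when the block exits.
with pd.option_context("display.max_rows", 500, "display.width", 300):
    rows_inside = pd.get_option("display.max_rows")
    print(demo_df)

rows_after = pd.get_option("display.max_rows")
```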

### 2-2. Writing the Revenue Recognition Code

#### 2-2-1. Writing the Prompt
![Writing the prompt](image/CH01-04-06.png)

#### 2-2-2. ChatGPT Answer
![ChatGPT answer](image/CH01-04-07.png)
![ChatGPT answer](image/CH01-04-08.png)
![ChatGPT answer](image/CH01-04-09.png)
![ChatGPT answer](image/CH01-04-10.png)

#### 2-2-3. ChatGPT Answer Code
```
import pandas as pd
import numpy as np

# Code that loads the DataFrames (assumed)
# from connector import Connector

# # Load the data from the database into DataFrames
# with Connector() as db:
#     daily_sales_raw_df = db.fetch_data("SELECT * FROM daily_sales_raw")
#     original_price_df = db.fetch_data("SELECT * FROM original_price")

# Create test data (comment this out when using data loaded from the actual database)
# Columns: 'id', 'payment_date', 'stay_days', 'checkin_date', 'checkout_date', 'original_price', 'payment_amount'
daily_sales_raw_df = pd.DataFrame({
    'id': [1, 2],
    'payment_date': ['2024-01-01', '2024-01-03'],
    'stay_days': [2, 3],
    'checkin_date': ['2024-01-01', '2024-01-03'],
    'checkout_date': ['2024-01-03', '2024-01-06'],
    'original_price': [200000, 300000],
    'payment_amount': [150000, 250000]
})

# Columns: 'date', 'weekday', 'original_price'
original_price_df = pd.DataFrame({
    'date': ['2024-01-01', '2024-01-02', '2024-01-03', '2024-01-04', '2024-01-05'],
    'weekday': [1, 2, 3, 4, 5],
    'original_price': [100000, 100000, 100000, 100000, 100000]
})

# Convert the date columns to datetime
daily_sales_raw_df['checkin_date'] = pd.to_datetime(daily_sales_raw_df['checkin_date'])
daily_sales_raw_df['checkout_date'] = pd.to_datetime(daily_sales_raw_df['checkout_date'])
original_price_df['date'] = pd.to_datetime(original_price_df['date'])

# Create the DataFrame that will hold the final result
result_df = pd.DataFrame(columns=['date', 'payment_no', 'original_price', 'accrued_revenue'])

# Allocate and match the data by date
for _, row in daily_sales_raw_df.iterrows():
    checkin_date = row['checkin_date']
    checkout_date = row['checkout_date']
    payment_amount = row['payment_amount']
    stay_days = (checkout_date - checkin_date).days
    payment_no = row['id']
    
    # Get the original_price rows for the stay period
    period_prices = original_price_df[(original_price_df['date'] >= checkin_date) & (original_price_df['date'] < checkout_date)]
    period_prices = period_prices.copy()
    period_prices['ratio'] = period_prices['original_price'] / period_prices['original_price'].sum()
    
    # Allocate payment_amount across the dates
    accrued_revenue = []
    for i, (date, ratio) in enumerate(zip(period_prices['date'], period_prices['ratio'])):
        if i < len(period_prices) - 1:
            accrued_revenue.append(int(payment_amount * ratio))
        else:
            accrued_revenue.append(payment_amount - sum(accrued_revenue))
    
    # Append to the result DataFrame
    temp_df = pd.DataFrame({
        'date': period_prices['date'],
        'payment_no': payment_no,
        'original_price': period_prices['original_price'],
        'accrued_revenue': accrued_revenue
    })
    
    result_df = pd.concat([result_df, temp_df])

# Reset the index
result_df.reset_index(drop=True, inplace=True)

# Print the result
print(result_df)

```
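The allocation rule in the loop above can be isolated into a small function: each night receives a share of `payment_amount` proportional to its list price, every share except the last is truncated to an integer, and the last night absorbs the rounding remainder so that the shares sum to the payment. The sketch below is a variant, not the ChatGPT answer itself. `allocate_payment` is a name introduced here, and it uses integer floor division in place of the float `ratio`, an assumption chosen to avoid floating-point truncation surprises. It assumes non-negative integer amounts.

```python
def allocate_payment(payment_amount, nightly_prices):
    """Split payment_amount across nights in proportion to nightly_prices.

    All shares but the last are floored; the last share takes whatever is
    left, so the shares always sum to payment_amount.
    """
    total_price = sum(nightly_prices)
    if total_price <= 0:
        raise ValueError("nightly_prices must sum to a positive amount")

    shares = []
    for price in nightly_prices[:-1]:
        shares.append(payment_amount * price // total_price)
    # The final night absorbs the rounding remainder.
    shares.append(payment_amount - sum(shares))
    return shares


# Payment 4 in the course data: 2,280,000 paid for five nights whose
# list prices are 700,000 + 4 x 500,000 = 2,700,000.
shares = allocate_payment(2_280_000, [700_000, 500_000, 500_000, 500_000, 500_000])
print(shares)  # [591111, 422222, 422222, 422222, 422223]
```

The four truncated shares add up to 1,857,777, so the last night receives 2,280,000 - 1,857,777 = 422,223, one more than the other 500,000 nights.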

#### 2-2-4. Applying the Generated Code

In [15]:
import pandas as pd
import numpy as np

# Convert the date columns to datetime
daily_sales_raw_df['checkin_date'] = pd.to_datetime(daily_sales_raw_df['checkin_date'])
daily_sales_raw_df['checkout_date'] = pd.to_datetime(daily_sales_raw_df['checkout_date'])
original_price_df['date'] = pd.to_datetime(original_price_df['date'])

# Create the DataFrame that will hold the final result
result_df = pd.DataFrame(columns=['date', 'payment_no', 'original_price', 'accrued_revenue'])

# Allocate and match the data by date
for _, row in daily_sales_raw_df.iterrows():
    checkin_date = row['checkin_date']
    checkout_date = row['checkout_date']
    payment_amount = row['payment_amount']
    stay_days = (checkout_date - checkin_date).days
    payment_no = row['id']
    
    # Get the original_price rows for the stay period
    period_prices = original_price_df[(original_price_df['date'] >= checkin_date) & (original_price_df['date'] < checkout_date)]
    period_prices = period_prices.copy()
    period_prices['ratio'] = period_prices['original_price'] / period_prices['original_price'].sum()
    
    # Allocate payment_amount across the dates
    accrued_revenue = []
    for i, (date, ratio) in enumerate(zip(period_prices['date'], period_prices['ratio'])):
        if i < len(period_prices) - 1:
            accrued_revenue.append(int(payment_amount * ratio))
        else:
            accrued_revenue.append(payment_amount - sum(accrued_revenue))
    
    # Append to the result DataFrame
    temp_df = pd.DataFrame({
        'date': period_prices['date'],
        'payment_no': payment_no,
        'original_price': period_prices['original_price'],
        'accrued_revenue': accrued_revenue
    })
    
    result_df = pd.concat([result_df, temp_df])

# Reset the index
result_df.reset_index(drop=True, inplace=True)

# Print the result
print(result_df)



           date payment_no original_price accrued_revenue
0    2024-01-13          1         700000          770000
1    2024-01-10          2         500000          490000
2    2024-01-11          2         500000          490000
3    2024-01-03          3         500000          540000
4    2024-01-07          4         700000          591111
5    2024-01-08          4         500000          422222
6    2024-01-09          4         500000          422222
7    2024-01-10          4         500000          422222
8    2024-01-11          4         500000          422223
9    2024-01-09          5         500000          502272
10   2024-01-10          5         500000          502272
11   2024-01-11          5         500000          502272
12   2024-01-12          5         700000          703184
13   2024-01-04          6         500000          441666
14   2024-01-05          6         700000          618333
15   2024-01-06          6         700000          618333
16   2024-01-0
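
The loop above extends `result_df` with `pd.concat` on every iteration, starting from an empty DataFrame. Recent pandas versions may emit a `FutureWarning` when an empty DataFrame is concatenated, and repeated concatenation copies the accumulated rows each time. One possible alternative, sketched here with the same allocation logic, is to collect the per-payment frames in a list and concatenate once. The function name `recognize_revenue` is introduced for illustration, and the inputs are the two-payment test data from the ChatGPT answer code.

```python
import pandas as pd


def recognize_revenue(sales_df, price_df):
    """Allocate each payment across its stay dates, weighted by list price."""
    frames = []
    for row in sales_df.itertuples(index=False):
        # Nights from check-in (inclusive) to check-out (exclusive).
        in_stay = (price_df["date"] >= row.checkin_date) & (price_df["date"] < row.checkout_date)
        period = price_df.loc[in_stay, ["date", "original_price"]].copy()
        if period.empty:
            continue  # no price rows for this stay; nothing to allocate

        ratios = period["original_price"] / period["original_price"].sum()
        # Truncate every share, then let the last night absorb the remainder.
        shares = [int(row.payment_amount * r) for r in ratios]
        shares[-1] = row.payment_amount - sum(shares[:-1])

        period["payment_no"] = row.id
        period["accrued_revenue"] = shares
        frames.append(period)

    columns = ["date", "payment_no", "original_price", "accrued_revenue"]
    if not frames:
        return pd.DataFrame(columns=columns)
    # A single concat at the end instead of one per payment.
    return pd.concat(frames, ignore_index=True)[columns]


sales_df = pd.DataFrame({
    "id": [1, 2],
    "checkin_date": pd.to_datetime(["2024-01-01", "2024-01-03"]),
    "checkout_date": pd.to_datetime(["2024-01-03", "2024-01-06"]),
    "payment_amount": [150000, 250000],
})
price_df = pd.DataFrame({
    "date": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04", "2024-01-05"]),
    "original_price": [100000] * 5,
})

result_df = recognize_revenue(sales_df, price_df)
print(result_df)
```

With this test data, payment 1 is split into two shares of 75,000 and payment 2 into 83,333, 83,333, and 83,334.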

## 3. Reviewing the Calculation Results

### 3-1. Writing the Prompt
![Writing the prompt](image/CH01-04-11.png)

### 3-2. ChatGPT Answer
![ChatGPT answer](image/CH01-04-12.png)
![ChatGPT answer](image/CH01-04-13.png)

In [16]:
# Sum accrued_revenue by payment_no
accrued_revenue_sum = result_df.groupby('payment_no')['accrued_revenue'].sum().reset_index()

# Select id and payment_amount from daily_sales_raw_df
payment_amount_df = daily_sales_raw_df[['id', 'payment_amount']]

# Rename payment_no to id so the two DataFrames can be merged
accrued_revenue_sum.rename(columns={'payment_no': 'id'}, inplace=True)

# Merge payment_amount_df with accrued_revenue_sum
comparison_df = pd.merge(payment_amount_df, accrued_revenue_sum, on='id', how='left')

# Compare payment_amount with the summed accrued_revenue
comparison_df['is_correct'] = comparison_df['payment_amount'] == comparison_df['accrued_revenue']

# Print the result
print(comparison_df)

# Final check
if comparison_df['is_correct'].all():
    print("All accrued_revenue calculations are correct.")
else:
    print("Some accrued_revenue calculations are incorrect.")


        id  payment_amount accrued_revenue  is_correct
0        1          770000          770000        True
1        2          980000          980000        True
2        3          540000          540000        True
3        4         2280000         2280000        True
4        5         2210000         2210000        True
5        6         3180000         3180000        True
6        7         3370000         3370000        True
7        8          930000          930000        True
8        9         1550000         1550000        True
9       10         2220000         2220000        True
10      11         2510000         2510000        True
11      12         3430000         3430000        True
12      13         2190000         2190000        True
13      14          780000          780000        True
14      15         3860000         3860000        True
15      16         1460000         1460000        True
16      17          740000          740000        True
17      18
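
The equality check above marks a payment as incorrect both when the amounts differ and when a payment has no allocated rows at all, because the left merge leaves `NaN` in `accrued_revenue`. A small sketch that separates the two cases is shown below. The numbers are made up, and the `missing` and `diff` columns are names introduced here; `_merge` is the column pandas adds when `indicator=True` is passed to `merge`.

```python
import pandas as pd

# Illustrative inputs; payment 3 deliberately has no allocated rows.
payments = pd.DataFrame({"id": [1, 2, 3], "payment_amount": [150000, 250000, 90000]})
allocated = pd.DataFrame({
    "payment_no": [1, 1, 2, 2, 2],
    "accrued_revenue": [75000, 75000, 83333, 83333, 83334],
})

# Total recognized revenue per payment.
totals = (
    allocated.groupby("payment_no", as_index=False)["accrued_revenue"]
    .sum()
    .rename(columns={"payment_no": "id"})
)

# indicator=True adds a "_merge" column showing which side each row came from.
check = payments.merge(totals, on="id", how="left", indicator=True)
check["missing"] = check["_merge"] == "left_only"
check["diff"] = check["payment_amount"] - check["accrued_revenue"]

print(check[["id", "payment_amount", "accrued_revenue", "diff", "missing"]])
```

A nonzero `diff` then points to an allocation error, while `missing` points to a stay whose dates were not found in the price table.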