# Problem - Wrangling CSV Files

You are working as a consultant for an environmental analytics firm. They’ve used temperature loggers to monitor the temperature in a number of local streams, and the data from each logger has already been downloaded to a separate CSV file. In reality, there are hundreds of files from several different streams. In the `logs` folder, I’ve provided a sample of just 14 files, all from one stream, BCM. Helpfully, the firm used structured filenames, so it’s clear exactly where each logger’s data came from.

**Task 1:** Combine all of the CSV files into a single Excel workbook, with one sheet per file. Name each sheet after its source file and give it a header row.

**Task 2:** Summarize each sheet (Min/Max/Average).

**Create a blank workbook and save it as BCM.xlsx**

In [1]:
from openpyxl import Workbook
workbook = Workbook()
workbook.save("BCM.xlsx")

**Insert the Contents of Each CSV into a New Sheet and Add Column Headers:**

In [2]:
import pandas as pd
from pathlib import Path

# Define the path to the logs folder
logs_path = Path('logs')

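# Open the existing workbook in append mode so each CSV becomes a new sheet.
# if_sheet_exists='replace' (pandas 1.3+) lets this cell be re-run without errors.
with pd.ExcelWriter('BCM.xlsx', engine='openpyxl', mode='a', if_sheet_exists='replace') as writer:
    # Sort the files so the sheets appear in a predictable order
    for csv_file in sorted(logs_path.glob('*.csv')):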
        # Read the CSV file into a DataFrame
        df = pd.read_csv(csv_file, header=None, names=['datetime', 'scale', 'temperature'])
        
        # Write the DataFrame to a sheet named after the CSV file (without the .csv extension)
        sheet_name = csv_file.stem
        df.to_excel(writer, sheet_name=sheet_name, index=False)
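
One caveat with naming sheets after file stems: Excel limits worksheet names to 31 characters and rejects the characters `[ ] : * ? / \`. The sample filenames may well be short enough, so this is only a precaution. A minimal sketch of a sanitizing helper follows; `safe_sheet_name` is a hypothetical name that does not appear in the cells above.

```python
import re

# Characters Excel does not allow in worksheet names.
_INVALID_SHEET_CHARS = re.compile(r"[\[\]:*?/\\]")


def safe_sheet_name(stem, max_len=31):
    """Return a sheet title Excel should accept (hypothetical helper).

    Invalid characters are replaced with underscores and the result is
    truncated to Excel's 31-character limit.
    """
    cleaned = _INVALID_SHEET_CHARS.sub("_", stem)
    return cleaned[:max_len]
```

It could be dropped into the loop as `sheet_name = safe_sheet_name(csv_file.stem)`. Truncation can make two long names collide, so a larger batch of files would also need a check for duplicate sheet names.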


**Create a Function to Convert the Datetime Format and Add Formulas to Each Sheet to Calculate Min/Max/Average:**

In [3]:
from openpyxl import load_workbook
from openpyxl.utils import get_column_letter
from datetime import datetime

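# Convert a "month/day/year hour:minute" string to a datetime object,
# which openpyxl writes as a true Excel date. Returns None if the value
# is not a string in that format (for example, an empty cell).
def convert_to_excel_datetime(date_str):
    try:
        return datetime.strptime(date_str, "%m/%d/%Y %H:%M")
    except (ValueError, TypeError):
        return None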

# Load the workbook and add summary formulas to each data sheet
workbook = load_workbook('BCM.xlsx')


for sheet_name in workbook.sheetnames:
    if sheet_name != 'Sheet':  # Skip the initial blank sheet
        sheet = workbook[sheet_name]
        max_row = sheet.max_row

        # Convert datetime strings to Excel datetime objects
        for row in range(2, max_row + 1):
            cell_value = sheet[f'A{row}'].value
            excel_datetime = convert_to_excel_datetime(cell_value)
            if excel_datetime:
                sheet[f'A{row}'].value = excel_datetime
        
        # Add labels for the formulas
        sheet['G2'] = 'Min Temperature'
        sheet['G3'] = 'Max Temperature'
        sheet['G4'] = 'Average Temperature'
        sheet['G6'] = 'Min Datetime'
        sheet['G7'] = 'Max Datetime'

        # Add the temperature formulas (column C), rounded to one decimal place
        sheet['H2'] = f"=ROUND(MIN(C2:C{max_row}), 1)"
        sheet['H3'] = f"=ROUND(MAX(C2:C{max_row}), 1)"
        sheet['H4'] = f"=ROUND(AVERAGE(C2:C{max_row}), 1)"

        # Add the earliest and latest timestamps (column A) as formatted text
        sheet['H6'] = f"=TEXT(MIN(A2:A{max_row}), \"yyyy-mm-dd hh:mm:ss\")"
        sheet['H7'] = f"=TEXT(MAX(A2:A{max_row}), \"yyyy-mm-dd hh:mm:ss\")"
        
        sheet.column_dimensions['A'].width = 18
        sheet.column_dimensions['G'].width = 18
        sheet.column_dimensions['H'].width = 18

# Save the workbook with the formulas
workbook.save('BCM.xlsx')
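
The formulas above are only evaluated once the workbook is opened in Excel; openpyxl stores the formula text, not its result. As a cross-check, the same statistics can be computed directly in Python. The sketch below is an illustration, not part of the original solution. `summarize_log` is a hypothetical helper, it assumes the three-column layout (datetime, scale, temperature) and the `%m/%d/%Y %H:%M` timestamp format used above, and the sample rows are made up.

```python
import csv
from datetime import datetime
from statistics import mean


def summarize_log(lines):
    """Summarize CSV rows of (datetime, scale, temperature).

    `lines` is any iterable of CSV text lines, such as an open file.
    Rows that do not parse are skipped. Returns None if no row parses.
    """
    stamps, temps = [], []
    for row in csv.reader(lines):
        if len(row) < 3:
            continue
        try:
            stamp = datetime.strptime(row[0].strip(), "%m/%d/%Y %H:%M")
            temp = float(row[2])
        except ValueError:
            continue  # skip rows with an unreadable date or temperature
        stamps.append(stamp)
        temps.append(temp)
    if not temps:
        return None
    return {
        "min_temp": round(min(temps), 1),
        "max_temp": round(max(temps), 1),
        "avg_temp": round(mean(temps), 1),
        "first_reading": min(stamps),
        "last_reading": max(stamps),
    }


# Made-up sample rows (assumption); the real logger files may differ.
sample = [
    "01/02/2020 10:00,C,4.2",
    "01/02/2020 11:00,C,5.8",
    "01/02/2020 12:00,C,6.1",
    "not a date,C,n/a",
]
summary = summarize_log(sample)
```

For a real file, the function can be given an open file handle, for example `with open(csv_file, newline='') as f: summarize_log(f)`. Python's `round` rounds exact halves to the nearest even digit while Excel's `ROUND` rounds them away from zero, so the two can occasionally differ in the last digit.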
