# Objective
The objective of this assignment is to read data from popular file formats (.csv, .txt, .json), perform operations on this data, and finally save the modified data back in the original file formats.

# Pipeline that needs to be followed

The overall objective of this project is to create a system for managing product information in an e-commerce platform. The different stages involved in the process are outlined below:

### 1. Set up project and load data

&emsp;**1.1** Import required libraries  
&emsp;**1.2** Load the data

### 2. Create or update data

&emsp;**2.1** Add or update sales data  
&emsp;**2.2** Add or update product details  
&emsp;**2.3** Add or update product description  
&emsp;**2.4** Update function

### 3. Save data to disk

&emsp;**3.1** Save data to disk


## **1.**  Set up project and load data  <font color = red>[15 marks]</font>

In this stage, you will set up the environment for this assignment by loading the required modules and files. You will explore the files by displaying their content.

### **1.1** - Import required modules  <font color = red>[5 marks]</font>

### Description
In this task, you will import all the necessary modules and packages required for performing various operations in the project.

In [2]:
# Use this cell to import all the required packages and methods

# Import package for navigating through files stored on your device/on Google Colaboratory
### CODE HERE ###
# os handles path joining and directory creation; pathlib extracts file name stems
import os
from pathlib import Path

# Import package for working with JSON files
### CODE HERE ###
import json

# Import package for working with CSV files
### CODE HERE ###
import csv

### **1.2** Load the data  <font color = red>[10 marks]</font>

### Description
In this task, you will write a function that ensures that the necessary files are loaded into the environment. To index the data, you will use a unique identifier called SKU.

This includes loading sales data from a CSV file, product details from JSON files, and product descriptions from text files. We recommend that you either use Jupyter Notebook or Google Colab to build and execute your code.
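
As a rough illustration of what loading produces, the sketch below reads a tiny, made-up CSV string with `csv.DictReader` and parses a made-up JSON string. The sample values and variable names are assumptions for illustration only. The point is that every CSV field arrives as a string, while JSON keeps its own types.

```python
import csv
import io
import json

# Made-up sample content standing in for sales_data.csv (illustration only)
sample_csv = "Product_SKU,Day1,Day2\nSKU_A,10,12\nSKU_B,8,10\n"

# DictReader maps each data row to a dict keyed by the header row
rows = list(csv.DictReader(io.StringIO(sample_csv)))

# Made-up sample content standing in for one product details JSON file
sample_json = '{"product_name": "Sample Item", "price": "$9.99"}'
details = json.loads(sample_json)

print(rows[0])   # CSV values are strings, e.g. '10' rather than 10
print(details)
```

This is why the functions later in the notebook store quantities as strings: it keeps new rows consistent with rows read from disk.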

First, if you are using Google Colab, mount Google Drive to your VM. If not, skip and comment out this cell.

In [None]:
# Use this cell to write your code for mounting your Google Drive
# Note: If you are not using Google Colab, please skip this cell

# In case you are using Google Colab, mount your Google Drive before moving on
# from google.colab import drive
# drive.mount('/content/drive', force_remount = True)

If you are using Colab, after mounting the drive you need to unzip the archive to extract the files inside it. You only need to perform this step once, so we recommend that you comment out your code for this step once it has executed.

In [None]:
# Use this cell to write your code for unzipping the data and storing it in Google Drive
# Note: If you are not using Google Colab, please skip this cell
# Note: You can comment out this cell after running it once

# Unzip your files and store them in your drive
# !unzip /content/drive/MyDrive/File_Handling_Project/mainfolder.zip

**Alternatively,** you can upload files to the Google Colab runtime environment without mounting Google Drive. In this case, you will always be in the same path/directory inside your Google Colab runtime. Files will be saved into your runtime and not into your Google Drive.
The files you uploaded will be available until you delete the runtime.

In [1]:
# Use this cell to write your code for uploading the zip file
# Note: If you are not using Google Colab, please skip this cell

# Upload the zip file to Google Colab runtime
#from google.colab import files
#uploaded = files.upload()

After uploading your zip file to the Google Colab runtime, unzip it to extract the files inside.

In [None]:
# Use this cell to write your code for unzipping the data and storing it in Google Colab runtime
# Note: If you are not using Google Colab, please skip this cell
# Note: You can comment out this cell after running it once

# Unzip your files and store them in Google Colab runtime
#!unzip /content/mainfolder.zip

Now define the *load_data()* function.

In [3]:
def load_data(main_folder):
    """Load sales rows (CSV), product details (JSON) and descriptions (TXT) from main_folder."""
    sales_data_list = []
    product_details_dict = {}
    product_descriptions_dict = {}

    # Load sales data from CSV; each row becomes a dict of strings keyed by column name
    sales_file_path = os.path.join(main_folder, 'sales_data.csv')
    try:
        with open(sales_file_path, 'r', newline='') as csvfile:
            sales_data_list = list(csv.DictReader(csvfile))
    except FileNotFoundError:
        print(f"Sales data file not found at: {sales_file_path}")

    # Assuming product details are in individual JSON files within 'product_details' subfolder
    # and product descriptions are in individual TXT files within 'product_descriptions' subfolder
    product_details_folder = os.path.join(main_folder, 'product_details')
    product_descriptions_folder = os.path.join(main_folder, 'product_descriptions')

    if os.path.isdir(product_details_folder):
        for filename in os.listdir(product_details_folder):
            if filename.endswith('.json'):
                # The key is the file name without its extension, so any file-name
                # prefix (for example 'details_') remains part of the key
                sku = Path(filename).stem
                file_path = os.path.join(product_details_folder, filename)
                try:
                    with open(file_path, 'r') as f:
                        product_details_dict[sku] = json.load(f)
                except (OSError, ValueError) as e:  # ValueError covers invalid JSON
                    print(f"Error loading product details for {sku}: {e}")

    if os.path.isdir(product_descriptions_folder):
        for filename in os.listdir(product_descriptions_folder):
            if filename.endswith('.txt'):
                # Same convention: a 'description_' prefix stays in the key
                sku = Path(filename).stem
                file_path = os.path.join(product_descriptions_folder, filename)
                try:
                    with open(file_path, 'r') as f:
                        product_descriptions_dict[sku] = f.read().strip()
                except (OSError, ValueError) as e:  # ValueError covers decoding errors
                    print(f"Error loading product description for {sku}: {e}")

    return product_details_dict, sales_data_list, product_descriptions_dict
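
Note that `Path(filename).stem` keeps any file-name prefix, so keys such as `details_AISJDKFJW93NJ` end up in the dictionaries while the sales rows use the bare SKU. If you want all three structures keyed the same way, one option is to strip the prefix when loading. The helper below is a hypothetical sketch: the name `sku_from_filename` and the prefix list are assumptions based on the file names that appear in this notebook's outputs, and it is not part of the required solution.

```python
from pathlib import Path

# Prefixes assumed from the file names seen in this notebook's outputs
KNOWN_PREFIXES = ("details_", "description_")

def sku_from_filename(filename):
    """Return the bare SKU for a file name such as 'details_ABC123.json'."""
    stem = Path(filename).stem  # drop the extension
    for prefix in KNOWN_PREFIXES:
        if stem.startswith(prefix):
            return stem[len(prefix):]
    return stem  # no known prefix: the stem is already the SKU

print(sku_from_filename("details_AISJDKFJW93NJ.json"))
print(sku_from_filename("description_LKDFJ49LSDJKL.txt"))
```

If you adopt something like this inside `load_data()`, `dump_data()` in stage 3 still writes the prefixed file names, because it adds a missing prefix before saving.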

Load your data here

In [4]:
# Use this cell to load the files
# Adjust the path below so that it matches the folder created when you unzipped the data
main_folder_address = '/content/main_folder/'
product_details, sales_data, product_descriptions = load_data(main_folder_address)

Sales data file not found at: /content/main_folder/sales_data.csv


## **2.** Update data  <font color = red>[25 marks]</font>
In this stage, you will define a function `update()` to add sales data, product details, and product descriptions for a new product or update an existing product. If the product does not exist, the function will default to creating a new product. If the product exists, the function will instead update that product. You will also define some sub-functions to complete smaller tasks.
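
The create-or-update behaviour described above is sometimes called an upsert. As a minimal sketch with made-up data (the SKUs, values, and the helper name `upsert_row` are placeholders), a dictionary handles both cases with a single assignment, whereas a list of rows has to be searched first:

```python
# Dictionary keyed by SKU: assignment creates the key or overwrites it
details = {"SKU_A": {"price": "$5.00"}}
details["SKU_A"] = {"price": "$6.00"}   # existing SKU: updated
details["SKU_B"] = {"price": "$7.00"}   # new SKU: created

# List of rows: look for a matching row before deciding what to do
rows = [{"Product_SKU": "SKU_A", "Day1": "10"}]

def upsert_row(rows, sku, values):
    """Update the row for `sku` if present, otherwise append a new row."""
    for row in rows:
        if row["Product_SKU"] == sku:
            row.update(values)
            return rows
    rows.append({"Product_SKU": sku, **values})
    return rows

upsert_row(rows, "SKU_A", {"Day1": "11"})  # updates the existing row
upsert_row(rows, "SKU_B", {"Day1": "3"})   # appends a new row
print(details)
print(rows)
```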

### **2.1** Update sales data  <font color = red>[5 marks]</font>

### Description
In this task, you will write a function to add sales data for a new product or update sales data for an existing product given the SKU and the quantities that need to be added or updated.

In [27]:
def update_sales_data(sales_data, sku, quantities):
    """Add or update the sales row for `sku`; the list is modified in place and returned."""
    # csv.DictReader yields strings, so store quantities as strings for consistency
    new_values = {day: str(quantity) for day, quantity in quantities.items()}

    # Update the existing row if the SKU is already present
    for item in sales_data:
        if item['Product_SKU'] == sku:
            item.update(new_values)
            return sales_data

    # Otherwise append a new row for this SKU
    sales_data.append({'Product_SKU': sku, **new_values})
    return sales_data

Check your code here.

In [28]:
sales_data = update_sales_data(sales_data,
                                'NEW_PRODUCT_1',
                                {'Day1': 5, 'Day2': 7, 'Day3': 10, 'Day4': 8, 'Day5': 12,
                                 'Day6': 15, 'Day7': 11, 'Day8': 9, 'Day9': 10, 'Day10': 13,
                                 'Day11': 16, 'Day12': 14, 'Day13': 18, 'Day14': 20})
sales_data

[{'Product_SKU': 'AISJDKFJW93NJ',
  'Day1': '10',
  'Day2': '12',
  'Day3': '15',
  'Day4': '18',
  'Day5': '20',
  'Day6': '22',
  'Day7': '25',
  'Day8': '28',
  'Day9': '26',
  'Day10': '30',
  'Day11': '32',
  'Day12': '29',
  'Day13': '27',
  'Day14': '24'},
 {'Product_SKU': 'DJKFIEI432FIE',
  'Day1': '8',
  'Day2': '10',
  'Day3': '12',
  'Day4': '15',
  'Day5': '20',
  'Day6': '18',
  'Day7': '14',
  'Day8': '13',
  'Day9': '17',
  'Day10': '10',
  'Day11': '8',
  'Day12': '11',
  'Day13': '14',
  'Day14': '16'},
 {'Product_SKU': 'GGOENEBJ079499',
  'Day1': '15',
  'Day2': '18',
  'Day3': '22',
  'Day4': '25',
  'Day5': '28',
  'Day6': '20',
  'Day7': '17',
  'Day8': '23',
  'Day9': '19',
  'Day10': '21',
  'Day11': '24',
  'Day12': '27',
  'Day13': '18',
  'Day14': '20'},
 {'Product_SKU': 'HJSKNWK429DJE',
  'Day1': '30',
  'Day2': '32',
  'Day3': '35',
  'Day4': '38',
  'Day5': '40',
  'Day6': '42',
  'Day7': '45',
  'Day8': '48',
  'Day9': '50',
  'Day10': '52',
  'Day11': '55

### **2.2** Update product details  <font color = red>[5 marks]</font>

### Description
In this task, you will write a function to add product details for a new product or update product details for an existing product using the product SKU.

In [29]:
def update_product_details(product_details, sku, product_info):
    product_details[sku] = product_info
    return product_details

Check your code here.

In [30]:
product_details = update_product_details(product_details,
                                          'NEW_PRODUCT_1',
                                          {'product_name': 'New Gadget',
                                           'brand': 'InnovativeTech',
                                           'model': 'IT-X1',
                                           'specifications': 'Wireless, long-lasting battery, waterproof',
                                           'price': '$99.99',
                                           'availability': 'In stock'})
product_details

{'details_OWEJL398FWJLK': {'product_name': 'Yoga Mat',
  'brand': 'ZenFitness',
  'model': 'EcoMat-500',
  'specifications': 'Non-slip, 6mm thickness, Eco-friendly material',
  'price': '$19.99',
  'availability': 'In stock'},
 'details_LKDFJ49LSDJKL': {'product_name': 'Anti-Aging Face Cream',
  'brand': 'GlowBeauty',
  'model': 'AgeDefy-300',
  'specifications': 'Natural ingredients, Hydrating formula',
  'price': '$39.99',
  'availability': 'In stock'},
 'details_NEKFJOWE9FDIW': {'product_name': 'Board Game',
  'brand': 'FamilyFun',
  'model': 'GameNight-2022',
  'specifications': '2-6 players, Ages 8 and up',
  'price': '$29.99',
  'availability': 'In stock'},
 'details_AISJDKFJW93NJ': {'product_name': 'Wall Art Print',
  'brand': 'ArtCraft',
  'model': 'NatureCanvas-1001',
  'specifications': 'Canvas print, Ready to hang',
  'price': '$49.99',
  'availability': 'In stock'},
 'details_DJKFIEI432FIE': {'product_name': "Men's Running Shoes",
  'brand': 'RunFit',
  'model': 'SpeedX-500

### **2.3** Update product description  <font color = red>[5 marks]</font>

### Description
In this task, you will write a function to add a product description for a new product or update the description of an existing product using its product SKU.

In [31]:
def update_product_description(product_descriptions, sku, description):
    product_descriptions[sku] = description
    return product_descriptions

Check your code here.

In [32]:
product_descriptions = update_product_description(product_descriptions,
                                                  'NEW_PRODUCT_1',
                                                  'This innovative new gadget offers unparalleled performance and durability. Perfect for all your daily needs, with advanced features and a sleek design.')
product_descriptions

{'description_LKDFJ49LSDJKL': "Rediscover youthful radiance with GlowBeauty's AgeDefy-300 Anti-Aging Face Cream.\nFormulated with natural ingredients and a hydrating formula, this skincare essential rejuvenates and nourishes your skin, leaving you with a vibrant and refreshed complexion.\nWith a stellar 4.7/5 stars rating, it's a must-have for those embracing the journey to ageless beauty.",
 'description_NEKFJOWE9FDIW': "Unleash the fun with FamilyFun's GameNight-2022 Board Game.\nDesigned for 2-6 players and suitable for ages 8 and up, this exciting game promises laughter and bonding moments for the entire family.\nWith a 4.4/5 stars rating, it's a testament to its ability to turn any ordinary night into an extraordinary game night filled with friendly competition and shared joy.",
 'description_GGOENEBJ079499': 'Dive into the future with the XYZ Electronics Smartphone, model ABC-2000.\nBoasting a 6.5-inch display, 128GB storage, and a 16MP camera, this powerful device redefines the 

### **2.4** Update function  <font color = red>[10 marks]</font>

### Description
In this task, you will write a function that combines the functionalities of adding sales data, product details, and product description for a new product SKU, or updating these for an existing product SKU.

In [33]:
def update(product_details, sales_data, product_descriptions, sku, quantities, product_info, description):
    sales_data = update_sales_data(sales_data, sku, quantities)
    product_details = update_product_details(product_details, sku, product_info)
    product_descriptions = update_product_description(product_descriptions, sku, description)

    return product_details, sales_data, product_descriptions

Check your code here.

In [34]:
product_details, sales_data, product_descriptions = update(
    product_details,
    sales_data,
    product_descriptions,
    'NEW_PRODUCT_2', # Example SKU for testing the combined update
    {'Day1': 20, 'Day2': 22, 'Day3': 25, 'Day4': 23, 'Day5': 28,
     'Day6': 30, 'Day7': 27, 'Day8': 26, 'Day9': 29, 'Day10': 31,
     'Day11': 33, 'Day12': 30, 'Day13': 28, 'Day14': 32}, # Sample quantities
    {'product_name': 'Super Widget', # Sample product_info
     'brand': 'MegaCorp',
     'model': 'MW-5000',
     'specifications': 'Advanced features, durable, sleek design',
     'price': '$199.99',
     'availability': 'Limited stock'},
    'A fantastic new widget from MegaCorp, offering cutting-edge technology and unparalleled user experience.' # Sample description
)

print("\n--- Updated Product Details ---")
print(product_details.get('NEW_PRODUCT_2'))
print("\n--- Updated Sales Data ---")
# Find the new product's sales data
new_product_sales = next((item for item in sales_data if item['Product_SKU'] == 'NEW_PRODUCT_2'), None)
print(new_product_sales)
print("\n--- Updated Product Descriptions ---")
print(product_descriptions.get('NEW_PRODUCT_2'))


--- Updated Product Details ---
{'product_name': 'Super Widget', 'brand': 'MegaCorp', 'model': 'MW-5000', 'specifications': 'Advanced features, durable, sleek design', 'price': '$199.99', 'availability': 'Limited stock'}

--- Updated Sales Data ---
{'Product_SKU': 'NEW_PRODUCT_2', 'Day1': '20', 'Day2': '22', 'Day3': '25', 'Day4': '23', 'Day5': '28', 'Day6': '30', 'Day7': '27', 'Day8': '26', 'Day9': '29', 'Day10': '31', 'Day11': '33', 'Day12': '30', 'Day13': '28', 'Day14': '32'}

--- Updated Product Descriptions ---
A fantastic new widget from MegaCorp, offering cutting-edge technology and unparalleled user experience.


## **3.** Save data to disk  <font color = red>[10 marks]</font>

In this stage, you will create a `dump_data()` function that saves the modified data in the corresponding file formats: CSV for sales data, JSON for product details, and plain text (.txt) for product descriptions.



### **3.1** Save data to disk  <font color = red>[10 marks]</font>

### Description
In this task, you will implement a Python function named `dump_data()` that saves the sales data, product details, and product descriptions to files within a specified directory. Each type of data goes into its corresponding file format: CSV for sales data, JSON for product details, and plain text for product descriptions. The task draws on file I/O, directory management, and data serialization in Python.
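
Before writing the full function, it can help to try the directory and serialization steps in isolation. The sketch below uses a temporary directory and a made-up record (the file name `details_SAMPLE.json` is a placeholder) to show that `os.makedirs(..., exist_ok=True)` is safe to call repeatedly and that a dictionary survives a JSON round trip unchanged. It illustrates the building blocks only and is not the required implementation.

```python
import json
import os
import tempfile

# Made-up product record (illustration only)
record = {"product_name": "Sample Item", "price": "$9.99"}

with tempfile.TemporaryDirectory() as root:
    # exist_ok=True makes the call safe to repeat if the folder already exists
    subfolder = os.path.join(root, "product_details")
    os.makedirs(subfolder, exist_ok=True)
    os.makedirs(subfolder, exist_ok=True)  # second call does nothing

    path = os.path.join(subfolder, "details_SAMPLE.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(record, f, indent=4)

    # Read the file back to confirm nothing was lost
    with open(path, "r", encoding="utf-8") as f:
        reloaded = json.load(f)

print(reloaded == record)
```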

In [35]:
def dump_data(sales_data, product_details, product_descriptions, main_folder):
    """Write sales data (CSV), product details (JSON) and descriptions (TXT) under main_folder."""
    # Ensure main folder and subfolders exist
    os.makedirs(main_folder, exist_ok=True)
    product_details_folder = os.path.join(main_folder, 'product_details')
    product_descriptions_folder = os.path.join(main_folder, 'product_descriptions')
    os.makedirs(product_details_folder, exist_ok=True)
    os.makedirs(product_descriptions_folder, exist_ok=True)

    # Dump sales data to CSV
    sales_file_path = os.path.join(main_folder, 'sales_data.csv')
    if sales_data:
        # Collect column names across all rows, in first-seen order, so that a row
        # with extra keys does not make DictWriter raise ValueError
        fieldnames = list(dict.fromkeys(key for row in sales_data for key in row))
        with open(sales_file_path, 'w', newline='') as csvfile:
            # restval fills cells that are missing from shorter rows
            writer = csv.DictWriter(csvfile, fieldnames=fieldnames, restval='')
            writer.writeheader()
            writer.writerows(sales_data)

    # Dump product details to JSON files, one file per SKU
    for sku, details in product_details.items():
        # Keys loaded from disk already carry the 'details_' prefix; new SKUs do not
        file_sku = sku if sku.startswith('details_') else f'details_{sku}'
        file_path = os.path.join(product_details_folder, f'{file_sku}.json')
        with open(file_path, 'w') as f:
            json.dump(details, f, indent=4)

    # Dump product descriptions to TXT files, one file per SKU
    for sku, description in product_descriptions.items():
        # Keys loaded from disk already carry the 'description_' prefix; new SKUs do not
        file_sku = sku if sku.startswith('description_') else f'description_{sku}'
        file_path = os.path.join(product_descriptions_folder, f'{file_sku}.txt')
        with open(file_path, 'w') as f:
            f.write(description)

    print("Data successfully dumped to disk.")

Check your function here.

In [36]:
dump_data(sales_data, product_details, product_descriptions, main_folder_address)

Data successfully dumped to disk.


You will notice that *mainfolder* now has new files in the `product_descriptions` and `product_details` subfolders, as well as new rows in *sales_data.csv*, corresponding to the products that you created in stage 2 while checking your code.
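
One way to confirm this without opening the folder by hand is to count the files by extension. The helper below is a hypothetical sketch (the name `count_files_by_extension` is an assumption, and the demonstration runs on a throwaway folder with made-up file names rather than on your data):

```python
import tempfile
from collections import Counter
from pathlib import Path

def count_files_by_extension(folder):
    """Count files under `folder` (recursively), grouped by extension."""
    return Counter(p.suffix for p in Path(folder).rglob("*") if p.is_file())

# Demonstration on a throwaway folder with made-up file names
with tempfile.TemporaryDirectory() as root:
    (Path(root) / "product_details").mkdir()
    (Path(root) / "product_details" / "details_SAMPLE.json").write_text("{}")
    (Path(root) / "sales_data.csv").write_text("Product_SKU\n")
    counts = count_files_by_extension(root)

print(dict(counts))
```

Calling it on `main_folder_address` before and after `dump_data()` should show the `.json` and `.txt` counts grow by the number of newly created products.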