**Documentation of Producing and Implementing**

#### Documentation: Test 1

In [None]:
import pandas as pd

# Load the dataset
df = pd.read_csv('cars_2010_2020.csv')
# Sorting_Make sorts the DataFrame alphabetically by the Make column
def Sorting_Make():
    global df
    df = df.sort_values('Make')

Sorting_Make()


# Sorting_Model sorts the DataFrame alphabetically by the Model column.
# Note: this re-sorts every row by Model, so Model becomes the primary order.
def Sorting_Model():
    global df
    df = df.sort_values('Model')

Sorting_Model()
# Print the final dataframe
print(df)


**Evaluation**

- Date: 12/8/24

- Adjustments/Changes: For the first test, I wrote this code to sort the cars by make and model. I originally planned to include the year as well, but that was unnecessary at this stage because I only needed the mean for each make.

- Did it Work: Yes. It prints the full dataframe in alphabetical order, starting from BMW, with the models also sorted alphabetically, starting from 3 Series.

- What do I do next: Following my data dictionary, I now need to implement the data analysis stage, where I find the mean price for each make and model and then visualise the results, keeping the year in the dataframe for cleanliness. The sorting step above is already done.
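
The sorting in Test 1 could also be written as a single call. The sketch below is only an illustration: it uses a small made-up DataFrame (the rows are hypothetical, not taken from cars_2010_2020.csv) and passes both column names to sort_values, so rows are ordered by Make first and by Model within each make.

```python
import pandas as pd

# Hypothetical sample rows, standing in for cars_2010_2020.csv
sample = pd.DataFrame({
    'Make': ['Toyota', 'BMW', 'Toyota', 'BMW'],
    'Model': ['Corolla', 'X5', 'Camry', '3 Series'],
})

# Sort by Make first, then by Model within each Make
sorted_sample = sample.sort_values(['Make', 'Model']).reset_index(drop=True)
print(sorted_sample)
```

Sorting on both keys at once keeps the makes grouped together, whereas two separate sort_values calls leave the data ordered by whichever column was sorted last.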

#### Documentation: Test 2

In [None]:
import pandas as pd
import matplotlib.pyplot as plt

# Load the dataset
df = pd.read_csv('cars_2010_2020.csv')

# Calculate the average price for each make, take the columns from the original csv file
average_make = df.groupby(['Make'])['Price (USD)'].mean().reset_index()
average_make.columns = ['Make', 'Average_Price']

# Round to 2 decimal places
average_make['Average_Price'] = average_make['Average_Price'].round(2)
print(average_make)

# Calculate the average price for each make/model/year/engine size/fuel type combination, take the columns from the original csv file
average_model = df.groupby(['Make', 'Model', 'Year', 'Engine Size (L)', 'Fuel Type'])['Price (USD)'].mean().reset_index()
average_model.columns = ['Make', 'Model', 'Year', 'Engine Size (L)', 'Fuel Type', 'Price']

# Round to 2 decimal places
average_model['Price'] = average_model['Price'].round(2)
print(average_model)

# Do the same for make, but for plotting (the other DataFrame holds too much information due to extra columns)
average_makeplot = df.groupby(['Make'])['Price (USD)'].mean().reset_index()
average_makeplot.columns = ['Make', 'Average_MakePlot']

# Do the same for model, but for plotting (the other DataFrame holds too much information due to extra columns)
average_modelplot = df.groupby(['Model'])['Price (USD)'].mean().reset_index()
average_modelplot.columns = ['Model', 'Average_ModelPlot']

# Plot the average price for each make
average_makeplot.plot(
    kind='bar',
    x='Make',
    y='Average_MakePlot',
    color='blue',
    alpha=0.3,
    title="Comparison of Average Make Prices"
)



# Plot the average price for each model
average_modelplot.plot(
    kind='bar',
    x='Model',
    y='Average_ModelPlot',
    color='blue',
    alpha=0.3,
    title="Comparison of Average Model Prices"
)

plt.show()

# Storing only model because we need each individual car with all attributes. The other dataframe is simply for the graph and for finding what is the cheapest average make.
average_model.to_csv('Average_Model.csv', index=False)

**Evaluation**

- Date: 13/8/24-14/8/24

- Adjustments/Changes: Compared with Documentation 1, this was a big change. Instead of sorting the original csv file, I made 2 separate dataframes from it to find the average make and model pricing. One of them was later stored in a new csv file; the other was only used for graphs and comparison. I then plotted these dataframes with matplotlib to get suitable graphs for the GUI.

- Did it Work: It did. The groupby function in pandas let me group by specific columns and calculate whatever I needed for each group. I then turned the result into a new dataframe that the user can access, visualised it, and stored it in a new csv file with every car in alphabetical order.

- What do I do next: With Data Analysis, Visualisation and Reporting finished, the next step is to create a GUI for the user to access. It will include 3 panels: one for the dataframe and 2 for the matplotlib graphs of the make and model prices, all driven by the new csv file called Average_Model.csv.
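
To make the groupby step in Test 2 easier to follow, here is a minimal sketch on a tiny made-up DataFrame (the rows and prices are hypothetical, not values from cars_2010_2020.csv). It uses pandas named aggregation, which is one possible alternative to renaming the columns afterwards, not the method used above.

```python
import pandas as pd

# Hypothetical sample rows; the real data comes from cars_2010_2020.csv
sample = pd.DataFrame({
    'Make': ['BMW', 'BMW', 'Toyota', 'Toyota'],
    'Model': ['X5', '3 Series', 'Camry', 'Camry'],
    'Price (USD)': [60000.0, 40000.0, 25000.0, 27000.0],
})

# Group by Make and compute the mean price; named aggregation
# creates the Average_Price column in the same step
average_make = (
    sample.groupby('Make', as_index=False)
          .agg(Average_Price=('Price (USD)', 'mean'))
          .round(2)
)
print(average_make)
```

Because groupby sorts its group keys by default, the result comes back in alphabetical order by Make without a separate sort.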

#### Documentation: Test 3

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
# This is a new line used for integrating matplotlib and tkinter together. 
from matplotlib.backends.backend_tkagg import FigureCanvasTkAgg
import tkinter as tk
from tkinter import ttk

# Load the data
df = pd.read_csv('Average_Model.csv')

# Calculate the average price for each make from the Price column of Average_Model.csv (Reused from previous Documentation)
avg_price_by_make = df.groupby(['Make'])['Price'].mean().reset_index()
avg_price_by_make.columns = ['Make', 'Average_Price']

# Round to 2 decimal places (Reused)
avg_price_by_make['Average_Price'] = avg_price_by_make['Average_Price'].round(2)

# Calculate the average price for each model from the Price column of Average_Model.csv (Reused from previous documentation)
avg_price_by_model = df.groupby(['Model'])['Price'].mean().reset_index()
avg_price_by_model.columns = ['Model', 'Price']

# Round to 2 decimal places (Reused)
avg_price_by_model['Price'] = avg_price_by_model['Price'].round(2)


# Create the main window
root = tk.Tk()
root.title("Average Prices of Cars from 2010 to 2020 (All the information you would ever need!)")
root.geometry("900x600")
root.config(bg="skyblue")

# Create a notebook widget
carsnotebook = ttk.Notebook(root)
carsnotebook.pack(fill='both', expand=True)

# Create frames for each tab
makeframe = ttk.Frame(carsnotebook)
modelframe = ttk.Frame(carsnotebook)
dataframe = ttk.Frame(carsnotebook)

# Add frames to notebook
carsnotebook.add(makeframe, text='Make Prices')
carsnotebook.add(modelframe, text='Model Prices')
carsnotebook.add(dataframe, text='DataFrame')

# Inserting Pandas Dataframe into the GUI: First Convert to String, create a text widget and insert it into the widget. 
def data_make():
    df_string = df.to_string()
    text_widget = tk.Text(dataframe)
    text_widget.insert(tk.END, df_string)
    text_widget.pack(side=tk.TOP, fill=tk.BOTH, expand=1)

# Function to create the bar plot for Make (Reused from previous Documentation)
def plot_make():
    # Create a figure and axes object to plot
    fig, ax = plt.subplots()
    avg_price_by_make.plot(
        kind='bar',
        x='Make',
        y='Average_Price',
        color='blue',
        alpha=0.3,
        ax=ax,
        title="Comparison of Average Make Prices"
    )
    # Embed the matplotlib figure in the frame: create the canvas, draw it, then pack its Tk widget.
    canvas = FigureCanvasTkAgg(fig, master=makeframe)
    canvas.draw()
    canvas.get_tk_widget().pack(side=tk.TOP, fill=tk.BOTH, expand=1)

# Function to create the bar plot for Model (Reused from previous Documentation)
def plot_model():
    # Create a figure and axes object to plot
    fig, ax = plt.subplots()
    avg_price_by_model.plot(
        kind='bar',
        x='Model',
        y='Price',
        color='blue',
        alpha=0.3,
        ax=ax,
        title="Comparison of Average Model Prices"
    )
    # Embed the matplotlib figure in the frame: create the canvas, draw it, then pack its Tk widget.
    canvas = FigureCanvasTkAgg(fig, master=modelframe)
    canvas.draw()
    canvas.get_tk_widget().pack(side=tk.TOP, fill=tk.BOTH, expand=1)

# Create buttons to reveal the plots and the dataframe (Calling the function for revealing the graphs and the dataframe)
Plot1_Make = tk.Button(makeframe, text="Plot Make Prices", command=plot_make)
Plot1_Make.pack(pady=10)

Plot2_Model = tk.Button(modelframe, text="Plot Model Prices", command=plot_model)
Plot2_Model.pack(pady=10)

DataFrame_Model = tk.Button(dataframe, text="Show DataFrame", command=data_make)
DataFrame_Model.pack(pady=10)

# Finally reveal the GUI
root.mainloop()


**Evaluation**

- Date: 14/8/24 - 15/8/24

- Adjustments/Changes: This is the finished GUI, reusing code from Documentation 2 as shown and using Tkinter to build the interface. It mainly uses notebook widgets, which I learned from this tutorial: https://www.pythontutorial.net/tkinter/tkinter-notebook/. This required me to import ttk from Tkinter. Then, to initialise matplotlib inside tkinter, I used the third import shown (FigureCanvasTkAgg), found on this website: https://pythonprogramming.net/how-to-embed-matplotlib-graph-tkinter-gui/. The rest of the code came from the original link given for making buttons: https://www.pythonguis.com/tutorials/create-buttons-in-tkinter/, and from the original code in Documentation 2.

- Did it Work: This ultimately worked after 2 days. One reason it did not work at first was that I forgot to pack the matplotlib canvas into the frame, and did not know that I had to. Putting matplotlib inside the GUI was the hardest part for me, and these lines were especially hard to understand: fig, ax = plt.subplots() and canvas = FigureCanvasTkAgg(fig, master=modelframe), canvas.draw(), canvas.get_tk_widget().pack(side=tk.TOP, fill=tk.BOTH, expand=1). After the tutorial they made much more sense. Another, more basic reason was bad indentation.

- What do I do next: Now that the GUI is finished, one improvement would be to add loops to make the code clearer and more concise, as there is a lot of redundancy and repetition. Overall, though, this is my final product and a successful use of Tkinter to make a functioning GUI.
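
One possible way to reduce the repetition between plot_make and plot_model is to move the shared plotting steps into a single helper. The sketch below is an assumption about how that could look, not part of the finished program: make_bar_figure is a hypothetical name, the sample prices are made up, and the figure is built with matplotlib's Figure class directly so the example runs without opening a window.

```python
import pandas as pd
from matplotlib.figure import Figure


def make_bar_figure(data, x, y, title):
    """Build a bar chart Figure for any DataFrame (hypothetical helper)."""
    fig = Figure(figsize=(6, 4))
    ax = fig.add_subplot(111)
    # Same styling as the plots in Test 3: blue bars with alpha=0.3
    ax.bar(data[x].astype(str), data[y], color='blue', alpha=0.3)
    ax.set_title(title)
    ax.set_xlabel(x)
    ax.tick_params(axis='x', labelrotation=90)
    return fig


# Hypothetical sample data, standing in for avg_price_by_make
prices = pd.DataFrame({
    'Make': ['BMW', 'Ford', 'Toyota'],
    'Average_Price': [50000.0, 30000.0, 26000.0],
})

fig = make_bar_figure(prices, 'Make', 'Average_Price',
                      'Comparison of Average Make Prices')
print(len(fig.axes[0].patches))  # number of bars drawn
```

In the GUI, each button callback could pass the returned figure to FigureCanvasTkAgg(fig, master=frame) as in Test 3, and a loop over (frame, dataframe, x, y, title) tuples could then create both plot buttons.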

#### Documentation: Test 4

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.backends.backend_tkagg import FigureCanvasTkAgg
import tkinter as tk
from tkinter import ttk

# Load the data
df = pd.read_csv('Average_Model.csv')

# Calculate the average price for each make from the Price column of Average_Model.csv (Reused from previous Documentation)
avg_price_by_make = df.groupby(['Make'])['Price'].mean().reset_index()
avg_price_by_make.columns = ['Make', 'Average_Price']

# Round to 2 decimal places (Reused)
avg_price_by_make['Average_Price'] = avg_price_by_make['Average_Price'].round(2)

# Calculate the average price for each model from the Price column of Average_Model.csv (Reused from previous documentation)
avg_price_by_model = df.groupby(['Model'])['Price'].mean().reset_index()
avg_price_by_model.columns = ['Model', 'Price']

# Round to 2 decimal places (Reused)
avg_price_by_model['Price'] = avg_price_by_model['Price'].round(2)


# Create the main window (Reused)
root = tk.Tk()
root.title("Average Prices GUI")
root.geometry("900x600")
root.config(bg="skyblue")

# Create a notebook widget
carsnotebook = ttk.Notebook(root)
carsnotebook.pack(fill='both', expand=True)

# Create frames for each tab
makeframe = ttk.Frame(carsnotebook)
modelframe = ttk.Frame(carsnotebook)
dataframe = ttk.Frame(carsnotebook)
helpframe = ttk.Frame(carsnotebook)

# Add frames to notebook
carsnotebook.add(makeframe, text='Make Prices')
carsnotebook.add(modelframe, text='Model Prices')
carsnotebook.add(dataframe, text='DataFrame')
carsnotebook.add(helpframe, text='Help')

# Inserting Pandas Dataframe into the GUI: First Convert to String, create a text widget and insert it into the widget. 
def data_make():
    df_string = df.to_string()
    text_widget = tk.Text(dataframe)
    text_widget.insert(tk.END, df_string)
    text_widget.pack(side=tk.TOP, fill=tk.BOTH, expand=1)

# Function to create the bar plot for Make (Reused from previous Documentation)
def plot_make():
    # Create a figure and axes object to plot
    fig, ax = plt.subplots()
    avg_price_by_make.plot(
        kind='bar',
        x='Make',
        y='Average_Price',
        color='blue',
        alpha=0.3,
        ax=ax,
        title="Comparison of Average Make Prices"
    )
    # Embed the matplotlib figure in the frame: create the canvas, draw it, then pack its Tk widget.
    canvas = FigureCanvasTkAgg(fig, master=makeframe)
    canvas.draw()
    canvas.get_tk_widget().pack(side=tk.TOP, fill=tk.BOTH, expand=1)

# Function to create the bar plot for Model (Reused from previous Documentation)
def plot_model():
    # Create a figure and axes object to plot
    fig, ax = plt.subplots()
    avg_price_by_model.plot(
        kind='bar',
        x='Model',
        y='Price',
        color='blue',
        alpha=0.3,
        ax=ax,
        title="Comparison of Average Model Prices"
    )
    # Embed the matplotlib figure in the frame: create the canvas, draw it, then pack its Tk widget.
    canvas = FigureCanvasTkAgg(fig, master=modelframe)
    canvas.draw()
    canvas.get_tk_widget().pack(side=tk.TOP, fill=tk.BOTH, expand=1)

# Create buttons/text to reveal the plots and the dataframe and help text (Calling the function for revealing the graphs and the dataframe and help text)
Plot1_Make = tk.Button(makeframe, text="Plot Make Prices", command=plot_make)
Plot1_Make.pack(pady=10)

Plot2_Model = tk.Button(modelframe, text="Plot Model Prices", command=plot_model)
Plot2_Model.pack(pady=10)

DataFrame_Model = tk.Button(dataframe, text="Show DataFrame", command=data_make)
DataFrame_Model.pack(pady=10)

Help_Text = tk.Label(helpframe, text="For context, the first frame contains the average brand pricing, second frame average model pricing of each brand, and the third frame contains all the information in the original dataframe.")
Help_Text.place(x=50, y=50)

# Finally reveal the GUI
root.mainloop()


**Evaluation**

- Date: 18/08/24 - 19/08/24

- Adjustments/Changes: Added a Help tab, which makes the program easier to access and understand.

- Did it Work: Yes. I used the same method as before for creating tabs and labels to add the Help tab, and it was successful.

- What do I do next: Make a README file, and finish testing and evaluating.