# Design of a simple Dash application using GitHub Copilot

## Overview

I had previously worked on a mini-project to create a Dash application showing usage patterns for the city of Boston's Blue Bikes system. Although that project was never finished, it was still in the back of my mind when I acquired a subscription to GitHub Copilot. I decided to continue the Blue Bikes project using Copilot, as a test case for pair programming with an AI system. The results were encouraging. While it was necessary to guide Copilot away from some dead ends, it was particularly useful when working with Dash, which requires many components in a particular order.

## Project background

The city of Boston operates a system known as Blue Bikes, which allows users to rent a bike using an application. Trip data from the system is made available to developers as part of the city's data portal.

The data, which is divided into monthly files, contains information on each trip, including the start and end stations, the duration, and the user type. The data is available at https://www.bluebikes.com/system-data.

The goal of the project was to create two classes: one to read the data into a local PostgreSQL database, and one to create a Dash application showing usage patterns by neighbourhood, time (day of week, hour of day and month of the year), and user type.
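As a rough illustration of the first step, the monthly files could be combined into one table before being loaded into the database. This is only a sketch under assumptions: the function name `load_monthly_files`, the file-name pattern and the column names are hypothetical, and the demonstration uses two tiny made-up files rather than real Blue Bikes data.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_monthly_files(data_dir: Path, pattern: str = "*-tripdata.csv") -> pd.DataFrame:
    """Read every monthly CSV in data_dir into a single DataFrame."""
    # The file-name pattern is an assumption; adjust it to the real downloads.
    files = sorted(data_dir.glob(pattern))
    if not files:
        raise FileNotFoundError(f"No files matching {pattern!r} in {data_dir}")
    # Tag each row with its source file so a month can be traced or reloaded.
    frames = [pd.read_csv(path).assign(source_file=path.name) for path in files]
    return pd.concat(frames, ignore_index=True)


# Demonstration with two tiny, made-up monthly files (hypothetical columns).
with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = Path(tmp)
    (tmp_dir / "202304-tripdata.csv").write_text(
        "start_station_id,user_type\n1,Subscriber\n2,Customer\n"
    )
    (tmp_dir / "202305-tripdata.csv").write_text(
        "start_station_id,user_type\n1,Subscriber\n"
    )
    journeys = load_monthly_files(tmp_dir)
```

Keeping the source file name on each row is a design choice for this sketch, not something the original project is known to do; it makes it easier to reload a single month if a file turns out to be malformed.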

## Data

Three datasets were necessary to develop the application: the Blue Bikes journey data, the Blue Bikes station locations, and a shapefile of Boston neighbourhoods. The Blue Bikes data was downloaded from the city's data portal. The shapefile was downloaded from the city's GIS portal.

## Initial development

It was initially planned to write all the necessary tables - journeys, stations and neighbourhoods - to the database. However, this substantially increased the needed processing time. The steps were therefore carried out locally, with the final table (containing neighbourhood and time information) being written to the database as journeys_enriched.
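The local enrichment step might look roughly like the sketch below. It assumes the station-to-neighbourhood lookup has already been produced (in the project this came from the shapefile), uses hypothetical column names, and swaps in an in-memory SQLite database for PostgreSQL so that the example runs without a database server.

```python
import sqlite3

import pandas as pd


def enrich_journeys(
    journeys: pd.DataFrame, station_neighbourhoods: pd.DataFrame
) -> pd.DataFrame:
    """Attach neighbourhood and time columns to each journey."""
    # A left join keeps every journey, even if its station has no neighbourhood.
    enriched = journeys.merge(
        station_neighbourhoods,
        left_on="start_station_id",
        right_on="station_id",
        how="left",
    ).drop(columns="station_id")
    start = pd.to_datetime(enriched["start_time"])
    enriched["day_of_week"] = start.dt.day_name()
    enriched["hour"] = start.dt.hour
    enriched["month"] = start.dt.month
    return enriched


# Made-up sample rows; all column names are assumptions.
journeys = pd.DataFrame(
    {
        "start_station_id": [1, 2, 1],
        "start_time": [
            "2023-05-01 08:15:00",
            "2023-05-06 17:40:00",
            "2023-06-11 23:05:00",
        ],
        "user_type": ["Subscriber", "Customer", "Subscriber"],
    }
)
station_neighbourhoods = pd.DataFrame(
    {"station_id": [1, 2], "neighbourhood": ["Back Bay", "Fenway"]}
)

enriched = enrich_journeys(journeys, station_neighbourhoods)

# SQLite stands in for PostgreSQL here; only the final table is written.
conn = sqlite3.connect(":memory:")
enriched.to_sql("journeys_enriched", conn, index=False, if_exists="replace")
row_count = conn.execute("SELECT COUNT(*) FROM journeys_enriched").fetchone()[0]
conn.close()
```

Writing only the final table means the database holds a single wide table that the dashboard can query directly, at the cost of repeating neighbourhood names on every row.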

The successful creation of the table was the most challenging part of the project. One issue was Copilot's tendency to predict future text or code based on locally available text or code. This had two consequences:

1. The code would often repeat itself, as Copilot would predict the same code as previous lines.
2. If a change in approach had to be made, Copilot would often continue to use the old approach until prompted not to do so.

It was, however, possible to overcome these issues by prompting Copilot, either in the initial design or with in-line comments.
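A hypothetical illustration of the kind of in-line comment meant here: the comments state the current design explicitly, so that suggestions are anchored to it rather than to older code elsewhere in the file. The function and the dictionary keys are invented for this example, and it is not a record of the prompts actually used.

```python
from collections import Counter

# Design note for Copilot (and human readers):
# - Read only from the journeys_enriched table; the separate journeys,
#   stations and neighbourhoods tables are no longer written to the database.
# - Each row is a dict with the keys "neighbourhood", "day_of_week",
#   "hour", "month" and "user_type" (hypothetical column names).


def trips_per_neighbourhood(rows: list[dict]) -> dict[str, int]:
    """Count journeys per start neighbourhood."""
    # Count in one pass; do not re-join stations to neighbourhoods here.
    return dict(Counter(row["neighbourhood"] for row in rows))


sample_rows = [
    {"neighbourhood": "Back Bay", "user_type": "Subscriber"},
    {"neighbourhood": "Fenway", "user_type": "Customer"},
    {"neighbourhood": "Back Bay", "user_type": "Customer"},
]
counts = trips_per_neighbourhood(sample_rows)
```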

These issues were also present in the Dash application, where Copilot would often suggest code based on previous versions of the journeys_enriched table. Consequently, it was necessary to write the first draft of the class creating the Dash application in a separate text file.

## Results

A preview of the application can be seen below:

    from IPython.display import Image

    Image(url="app_preview.jpg")

The application is quite simple. However, the goal of this project was to test the use of Copilot in a real-world application, and in that respect the results were encouraging. Copilot had to be guided away from some dead ends, but it was particularly useful when working with Dash, which requires many components in a particular order. It was also useful when writing SQL queries, for which it was able to suggest the correct syntax.

## Reflections

I observed the following while using Copilot:

- The biggest challenge so far has been Copilot's programmatic nearsightedness (its tendency to predict future text or code based on locally available text or code). This meant that, on occasion, Copilot went down dead ends by repeating code or continuing to use old approaches.

- I had previously worked on a manual version of this project, and was therefore able to guide Copilot through the use of complex packages such as geopandas. It would have been difficult to do so either completely from scratch, or without experience with these packages.

- The more detail given in a design specification, the better. However, Copilot still occasionally misses instructions (it is not, for instance, able to reliably generate docstrings or argument type hints).

## Next steps

As of 09.06.2023, I intend to continue with the following steps and new features:

1. The creation of a Folium map to show usage patterns by neighbourhood visually.
2. The creation of a Folium map to show the most popular routes (an idea suggested by Copilot).
3. The creation of a Folium map to show the most popular stations (an idea suggested by Copilot).
