# Introduction to GitHub Copilot for Data Science

---
## Intro

This repository contains the source code for the complete workshop. You will follow the step-by-step guide below, completing all the steps while working with data and GitHub Copilot within Codespaces.

> 📝 **Note:**
> This repo is intended to give an introduction to various **GitHub Copilot** features, such as **Copilot Chat** and **inline chat**. For that reason, the step-by-step guides below give only a general description of what needs to be done; use Copilot Chat or inline chat to generate the necessary commands.
>
> Each step (where applicable) also contains a `Cheatsheet` which has some suggested prompts if you get stuck.

> 💡 Play around with different prompts and see how they affect the accuracy of the GitHub Copilot suggestions. For example, when using inline chat, you can use an additional prompt to refine the response without having to rewrite the whole prompt.

### Data Science Project features

In this workshop, you will be working with CSV data included in this repository, as well as a Jupyter Notebook that starts some analysis of the data. Here are some features of the project you will work with:

1. Review a CSV dataset and clean the data
1. Fast track creating a new analysis notebook from scratch
1. Identify correlations and create polished, exportable results

## Workshop Preparation


This repository is Codespaces-ready and is pre-configured so that you have all dependencies installed including the Visual Studio Code extensions necessary to work with GitHub Copilot, Jupyter Notebooks, and Python:

- GitHub Copilot
- Python extension
- Jupyter extension
- Data Wrangler extension
- Pre-installed Python dependencies with an activated Virtual Environment

> ❗
> If you use this repository in your own account or in a non-GitHub-Universe organization, you might incur charges or consume your free Codespaces quota.

### 1. Create a new repository from this template

⏳ **~2min**

- Click `Use this template` $\rightarrow$ `Create a new repository`
- Set the owner to your organization or, if you have none, your personal account.
- Give it a name
- Set visibility to `Private`
- Click `Create repository`

### 2. Create a Codespace using the provided template

⏳ **~3min**

- In the newly created repo, click `Code` $\rightarrow$ `Codespaces` $\rightarrow$ `[ellipsis menu]` $\rightarrow$ `New with options` $\rightarrow$ _Ensure that `Dev container configuration` is set to `Default project configuration`_ $\rightarrow$ `Create codespace`
- ❗If you're having problems launching the Codespace then you can also clone the repo and continue from here in your IDE:

    ```sh
    git clone https://github.com/<YOUR_NAME_SPACE>/<YOUR_REPO_NAME>.git
    cd <YOUR_REPO_NAME>
    ```

> 📝 **Note:** There is no need to push changes back to the repo during the workshop

### 3. Verify Python is installed and set correctly

⏳ **~2min**

- Use the command palette to toggle the terminal (search for "Create new terminal")
- Run `which python` and make sure it points to the Virtual Environment (`/home/vscode/venv/bin/python`)
- Run `which pip` and ensure that it also points to the Virtual Environment (`/home/vscode/venv/bin/pip`)
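
The same check can also be made from inside Python, for example in a notebook cell. This is a minimal sketch using only the standard library; the helper name `in_virtualenv` is hypothetical and not part of the workshop code.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


# Path of the interpreter that is actually running this code
print(sys.executable)
print("virtual environment active:", in_virtualenv())
```

If the printed interpreter path is not inside the Virtual Environment, select the `venv` interpreter in VS Code before continuing.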

### 4. Open relevant files

⏳ **~2min**

GitHub Copilot benefits from having context. One way to enhance context is by opening relevant files; alternatively, you can add them to your prompt context later.

> ❗If the Copilot icon on the right side of VS Code's bottom bar has a slash through it, click it and enable Copilot.

- Open the `data/acnh-data/villagers.csv`, `data/acnh-data/fish.csv` and `workshop/villagers_analysis.ipynb` files.

## Data Wrangling

### 1. See how much you can learn about the project and the data

⏳ **~5min**

- Open GitHub Copilot Chat (click on the sparkling thought bubble at the top of the VS Code window)
- Make sure the mode dropdown in the bottom left corner of the chat box is set to "agent"
- Use the `@workspace` agent to ask Copilot about the nature of the data you are going to work with
- Also ask `@workspace` which Python packages the project uses, and in what ways

<details>
<summary>Cheatsheet</summary>

##### Prompt

```sh
@workspace Tell me about this project
@workspace What python packages does this repo use and why?
```

</details>

### 2. Review the data files with Data Wrangler

⏳ **~5min**

- Right-click on `data/acnh-data/fish.csv` and open it with Data Wrangler. If it needs a kernel selected, choose the `venv` Python environment
- Scroll around and look at the state of the data, noting all of the column summaries at the top
- Using the Copilot agent built into Data Wrangler in the panel below the grid, ask it to change the values of the `Rain/Snow Catch Up` column to a bool type with 0 as false
- Click the apply button below the Copilot generated code to apply the transformation
- Export your new data as a CSV file named `fish-cleaned.csv` via the `Export to CSV` button just above the grid

> 📝 **Note:**
> There is more cleaning we could do, but we will get Copilot to help more in future steps.

> 💡 If you need to repeat the same cleaning process many times, you can export your steps in Data Wrangler to a reusable notebook, or copy them as Python code to your clipboard!


<details>
<summary>Cheatsheet</summary>

##### Prompt

```sh
Convert this column to a bool type where No is 0
```

</details>
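
For reference, the transformation requested in Data Wrangler above corresponds roughly to the pandas code below. This is only a sketch, not the code Copilot will generate: it assumes the column holds the strings `Yes` and `No`, and the helper name `yes_no_to_bool` is hypothetical.

```python
import pandas as pd

COLUMN = "Rain/Snow Catch Up"


def yes_no_to_bool(df: pd.DataFrame, column: str = COLUMN) -> pd.DataFrame:
    """Return a copy of df with a Yes/No column converted to a boolean column."""
    cleaned = df.copy()
    # Assumption: the raw values are the strings "Yes" and "No".
    # Anything else becomes a missing value (<NA>) instead of raising an error.
    cleaned[column] = cleaned[column].map({"Yes": True, "No": False}).astype("boolean")
    return cleaned


# Tiny in-memory example, so the sketch runs without the workshop data
sample = pd.DataFrame({COLUMN: ["Yes", "No", "No"]})
print(yes_no_to_bool(sample)[COLUMN])
```

The nullable `boolean` dtype is used here so that unexpected values show up as missing data rather than being silently coerced.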

## Initial analysis

### 3. Start a new notebook for the cleaned fish data

⏳ **~5min**

- Open the Copilot chat panel from the icon in the top menu bar or the command palette (View: Toggle Chat)
- Click the ➕ button at the top of the panel to start a new chat
- Make sure you have your new `fish-cleaned.csv` file open in the editor to make things easy.
- In the text box at the bottom click the add context button and select your newly cleaned `fish-cleaned.csv`
- In the bottom left of the chat box, select the `ds-create` mode for the chat.
- Ask Copilot to start a new Jupyter Notebook for this data in the chat panel text box


<details>
<summary>Cheatsheet</summary>

##### Prompt

```sh
Can you start a new notebook for this fish data?
```

</details>
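
The notebook Copilot generates will vary, but a first cell usually loads the data and summarizes each column. The sketch below shows one possible shape for that cell; the helper name `load_and_summarize` is hypothetical, and the path to `fish-cleaned.csv` depends on where you exported the file.

```python
import pandas as pd


def load_and_summarize(csv_path: str) -> tuple[pd.DataFrame, pd.DataFrame]:
    """Load a CSV file and build a per-column overview table."""
    df = pd.read_csv(csv_path)
    summary = pd.DataFrame(
        {
            "dtype": df.dtypes.astype(str),  # inferred type of each column
            "missing": df.isna().sum(),      # number of missing values
            "unique": df.nunique(),          # number of distinct non-missing values
        }
    )
    return df, summary


# Example usage (adjust the path to your exported file):
# fish, overview = load_and_summarize("fish-cleaned.csv")
# overview
```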

### 4. Review and edit visualizations inline in notebooks

⏳ **~3min**

- In your newly created notebook, go to a cell output that has a plot
- Click and drag it to the Copilot chat panel and ask Copilot directly to change something about the plot (you can also add it by clicking the `...` at the top right of the cell)


<details>
<summary>Cheatsheet</summary>

##### Prompt

```sh
Can you make all of these plots higher DPI and use colorblind-safe color palettes?
```

</details>
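
If you want to compare Copilot's answer against a known-good baseline, the sketch below shows one way to get higher-DPI, colorblind-safe plots with Matplotlib. It assumes the notebook plots with Matplotlib; the helper name `use_accessible_defaults` and the DPI values are illustrative choices, not workshop requirements.

```python
import matplotlib.pyplot as plt


def use_accessible_defaults(screen_dpi: int = 150, export_dpi: int = 300) -> None:
    """Switch to a colorblind-safe color cycle and raise the default resolution."""
    # "tableau-colorblind10" is a style sheet that ships with Matplotlib
    plt.style.use("tableau-colorblind10")
    plt.rcParams["figure.dpi"] = screen_dpi   # resolution of figures shown inline
    plt.rcParams["savefig.dpi"] = export_dpi  # resolution used by savefig by default


use_accessible_defaults()

# Figures created after this point pick up the new defaults
fig, ax = plt.subplots()
for offset in range(3):
    ax.plot([0, 1, 2], [offset, offset + 1, offset + 2], label=f"series {offset}")
ax.legend()
```

Call the helper once near the top of the notebook so that every later plot uses the same settings.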

## Extending and refining notebooks

### 5. Continue the villager analysis with the `ds-categorical-analysis` agent

⏳ **~5min**

- Open the `villager_analysis.ipynb` file and the Copilot chat side panel
- Click the ➕ button at the top of the panel to start a new chat
- Add the `villager_analysis.ipynb` notebook to the context of the chat and switch the agent mode in the bottom left to the `ds-categorical-analysis` mode.
- Ask Copilot to extend the analysis of the villagers and see if there are any villager traits that are correlated.


<details>
<summary>Cheatsheet</summary>

##### Prompt

```sh
Help me figure out if any properties of the villagers are correlated.
```

</details>
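
Most villager traits are categorical, so a plain Pearson correlation does not apply directly. One common measure of association between two categorical columns is Cramér's V, sketched below with pandas and NumPy. This is an illustration under stated assumptions, not necessarily the method the `ds-categorical-analysis` agent will choose; the helper name `cramers_v` is hypothetical, and the column names in the usage comment are assumptions about the dataset.

```python
import numpy as np
import pandas as pd


def cramers_v(x: pd.Series, y: pd.Series) -> float:
    """Cramér's V: 0 means no association, 1 means perfect association."""
    # Contingency table of observed counts for every pair of categories
    observed = pd.crosstab(x, y).to_numpy(dtype=float)
    n = observed.sum()
    # Counts expected if the two variables were independent
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / n
    chi2 = ((observed - expected) ** 2 / expected).sum()
    k = min(observed.shape) - 1
    if k == 0:
        # A variable with a single category cannot be associated with anything
        return 0.0
    return float(np.sqrt(chi2 / (n * k)))


# Example usage (column names are assumptions about villagers.csv):
# villagers = pd.read_csv("data/acnh-data/villagers.csv")
# cramers_v(villagers["Species"], villagers["Personality"])
```

This version applies no bias correction, so values for small samples or columns with many categories tend to be inflated.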

### 6. Prepare graphics for export

⏳ **~2min**

- In a cell generating plots in the notebook, open inline Copilot chat (Ctrl-i / Cmd-i) and ask the agent to help add a method to export the plots in that cell as high-resolution PNGs

<details>
<summary>Cheatsheet</summary>

##### Prompt

```sh
Export all the plots in this cell as high DPI PNGs
```

</details>
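
As a point of comparison for whatever Copilot suggests, the sketch below saves every Matplotlib figure that is currently open as a PNG. It assumes the plots are Matplotlib figures; the helper name `save_open_figures`, the `figures` output folder, and the file naming scheme are all hypothetical.

```python
from pathlib import Path

import matplotlib.pyplot as plt


def save_open_figures(out_dir: str = "figures", dpi: int = 300) -> list[Path]:
    """Save every currently open Matplotlib figure as a PNG and return the paths."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    saved = []
    for num in plt.get_fignums():
        target = out_path / f"figure_{num}.png"
        # bbox_inches="tight" trims the surrounding whitespace from the export
        plt.figure(num).savefig(target, dpi=dpi, bbox_inches="tight")
        saved.append(target)
    return saved
```

Call `save_open_figures()` at the end of the plotting cell, before the figures are closed, so that all plots created in that cell are written to disk.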

### 7. Create shareable output

⏳ **~2min**

- With the notebook you created open, find the export button in the notebook's top toolbar (it may be under the `...`)
- Choose HTML or PDF, then check the output: for HTML, right-click the file in the VS Code file browser and choose `Show Preview`; for PDF, just open the file

## Clean-up

### 8. Delete your Codespace

⏳ **~1min**

Before deleting, you can push your changes if you wish. Remember that workshop repositories are temporary too.

Go to https://github.com/codespaces and find your current running Codespace and delete it.

## Additional resources

If you want to learn more about using GitHub Copilot, check out these resources:

* [GitHub Copilot Documentation](https://docs.github.com/copilot)
* [VS Code video series: GitHub Copilot](https://www.youtube.com/playlist?list=PLj6YeMhvp2S7rQaCLRrMnzRdkNdKnMVwg)
* [Blog: Best practices for prompting Copilot](http://blog.pamelafox.org/2023/06/best-practices-for-prompting-github.html)

Also check out the [GitHub Foundations learning path](https://learn.microsoft.com/training/paths/github-foundations/) for more resources on GitHub and GitHub Copilot.