# Hacking heritage: Day 3, Morning

The GitHub repository with all the notebooks and course materials is here: 

https://github.com/wragge/hackingheritage2020 

Feel free to use and share.

As I mentioned, the JupyterHub server and Slack workspace we've been using will stay accessible for at least a month, so you can continue to work on your projects and chat with each other.

There are instructions below for saving your work into your own GitHub account.

## Possible projects for today

* Continue working through the 'Zoom in' exercise from [yesterday morning](Day2-Morning.ipynb), harvesting datasets of newspaper articles and analysing the results (leave out the Datasette step)

* Explore datasets in the CSV Explorer and document your findings in a notebook (or even just a Google doc). What fields do they contain? What is the range of data? Note that you can save any of the charts in the CSV Explorer as images.

* Try [harvesting the results of a search](../recordsearch/harvesting_items_from_a_search.ipynb) from the National Archives' [RecordSearch](http://recordsearch.naa.gov.au/) database. Save the results as a CSV file, then try uploading the CSV to the CSV Explorer.

* Create a dataset of images, for example [front covers from journals in Trove](../trove-journals/Get-page-images-from-a-Trove-journal.ipynb) (browse a [list of the journals](https://trove-titles.herokuapp.com/)), or [ephemera](../building-blocks/harvesting_ephemera.ipynb) in Trove (posters and suchlike).

* Create a list of interesting items using the Trove web interface, and then [convert that list to CSV files](../trove-lists/Convert-a-Trove-list-into-a-CSV-file.ipynb) for further analysis.

* Try [harvesting a collection of politicians' press releases and interviews](../trove-journals/Harvest-parliament-press-releases.ipynb) from the Parliamentary Library (via Trove). Save the results as a CSV and upload to the CSV Explorer.

* Compare results of searches in Trove using QueryPic with [visualisations of searches in Papers Past (NZ)](../digitalnz/Visualise-a-search-in-PapersPast.ipynb). Document the results.

* Using the [notebook to analyse Hansard by year and house](../australian-commonwealth-hansard/convert-a-year-to-dataframe.ipynb), create a document compiling data across a number of years.

### Some general hints

* Google is a coder's most important tool – google your error messages! Someone on Stack Overflow has probably already solved your problem.

* Go to the documentation! I always have a tab open for Pandas because I can never remember specific commands, and the same often goes for basic Python functions.

* Let me know if you get any ImportErrors – I might need to install something on the server.


## Saving your work

There are a couple of ways you can save the notebooks and datasets you've been using on this JupyterHub server. The simplest is to right-click on any of the files you want to save in the file browser, and choose the 'Download' option. The files will then be saved to your own computer.
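
If you have a whole folder of files to save, one option is to bundle it into a single archive from a terminal first (open one from the Launcher), then download that one file. This is a minimal sketch, assuming a folder called `my-data`, which is a made-up name. Swap in the name of one of your own folders.

```bash
folder="my-data"                       # hypothetical name: replace with one of your own folders
mkdir -p "$folder"                     # does nothing if the folder already exists
tar -czf "$folder.tar.gz" "$folder"    # bundle the folder into one compressed file
tar -tzf "$folder.tar.gz"              # list the archive's contents to check it worked
```

The `.tar.gz` file should then appear in the file browser, where you can right-click and download it like any other file.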

The second is by connecting the whole collection of notebooks and course materials to your own GitHub account.

1. First of all you need to generate a Personal Access Token for your GitHub account. Follow [the instructions here](https://docs.github.com/en/github/authenticating-to-github/creating-a-personal-access-token) and under 'Select scope' check the 'repo' box. Make sure you keep a copy of the token as you can't retrieve it again later (although you can just make another). Personal Access Tokens offer an extra level of security because you can limit their scope, and delete them when you've finished with them.

2. Now go to your GitHub account and create a new repository. Just click on the '+' icon in the top menu bar and choose 'New repository'. Give it any name you like, such as 'hhbackup'.

![GitHub url to copy](../images/github-url.png)

3. Once you've created the repository, you'll see something like the image above. Copy the url in the box.

4. Come back to the JupyterHub server, and click on the '+' sign in the top left menu. This will open a new 'Launcher' window.

5. From the Launcher, click on the 'Terminal' button. This will open a terminal where you can use the command line to change the git settings.

6. Enter the following commands in the terminal. First move into the main course folder:

```bash
cd hackingheritage2020
```

7. Then add a link to your GitHub repository, replacing `YOUR_GITHUB_REPO_URL` with the GitHub url you copied in step 3:

```bash
git remote add mybackup YOUR_GITHUB_REPO_URL
```

8. OK, now we can copy the notebooks to your new repo:

```bash
git push mybackup master
```

9. You'll then be asked for your GitHub username and password. Enter your username as normal, but use your Personal Access Token instead of your password.

10. Done! Go and check your repository – all the code should be there (you might have to reload the repository page).

If you make changes to your notebooks after you've copied the code to your repository, you'll have to commit and push your changes.

1. In the terminal, type the following to stage changed files:

```bash
git add .
```

2. Then commit the changes (the message in the quotes can be whatever you want):

```bash
git commit -m 'My latest changes'
```

3. Now push the changes on to GitHub:

```bash
git push mybackup master
```
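
If you'd like to watch the whole add, commit, and push cycle work before trying it on your real notebooks, here's a practice run you can paste into the terminal. It's only a sketch: a local folder called `fake-github.git` stands in for GitHub, and the file name, user name, and email are all invented for the demo. It happens entirely in a temporary folder, so it doesn't touch the course materials or need your token.

```bash
demo=$(mktemp -d)                                  # scratch folder for the practice run
git init --quiet --bare "$demo/fake-github.git"    # stands in for your new GitHub repo
git init --quiet "$demo/work"                      # stands in for the course folder
(
  cd "$demo/work"
  echo "hello" > notes.txt                         # pretend this is a changed notebook
  git add .                                        # stage the change
  git -c user.name="Demo" -c user.email="demo@example.com" \
      commit --quiet -m 'My latest changes'        # commit it
  git remote add mybackup "$demo/fake-github.git"  # link to the stand-in repo
  branch=$(git symbolic-ref --short HEAD)          # 'master' or 'main', depending on git settings
  git push --quiet mybackup "$branch"              # copy the commit across
  git remote -v                                    # shows where 'mybackup' points
)
# The commit should now be listed in the stand-in repo
git --no-pager --git-dir="$demo/fake-github.git" log --oneline
```

The brackets keep the `cd` contained, so you'll still be in the same folder when it finishes. `git remote -v` is also handy in the real course folder if you want to check that `mybackup` points at the right url.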

## Guess what? Your new repository is Binder enabled!

I've included in the repository a file called `requirements.txt` which includes all the various Python packages that are used in the notebooks. There's also one called `postBuild` that includes some Jupyter server configuration commands. These two files can be used by Binder to set up an environment that runs all the notebooks in the repository.

1. Copy the url of your GitHub repository. This is just the normal web address in your browser's location bar, not the special url we used to upload files to the repository.

2. Go to [Binder](https://mybinder.org/).

![Binder start box](../images/binder-start.png)

3. Paste your GitHub repository url in the appropriate box, and click the **Launch** button. If you want to see what Binder's doing in the background, click 'show' on the 'Build logs' tab.

4. It might take a while to build, but once it's finished it'll start up Jupyter in the 'classic' view. To swap to Jupyter Lab, just replace 'tree' at the end of the url with 'lab'.

5. Use your notebooks! Remember Binder will shut down inactive notebooks after about 10 minutes, and won't save your work. So make sure you save anything you want to keep by downloading it!
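
You can also skip the Binder form and build a launch link yourself. As far as I know, links on mybinder.org follow the pattern sketched below, and adding `?urlpath=lab` asks for Jupyter Lab straight away. The username is a placeholder, `hhbackup` is just the example repository name from earlier, and `master` assumes that's the branch you pushed.

```bash
github_user="YOUR_GITHUB_USERNAME"   # placeholder: your GitHub username
repo_name="hhbackup"                 # example name from the steps above; use your own
branch="master"                      # the branch you pushed to
binder_url="https://mybinder.org/v2/gh/${github_user}/${repo_name}/${branch}?urlpath=lab"
echo "$binder_url"
```

Pasting the printed address into your browser should start the same build as the **Launch** button, and it makes a handy link to share with other people.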