# Introduction

One of HiSparc's goals is to bring real science to high school classrooms, and this document is part of that effort. Here you will find a walkthrough of Python code that performs calibration for HiSparc detectors. Along with explanations of parts of the code, this notebook also provides exercises to help you get used to Python and real data analysis.

If you follow this entire notebook you will have learned the following:
* Why Python is used in modern science.
* What Python packages are and what they do.
* How to read data files using Python.
* What binning is and why it's useful.
* How to make a simple graph using Python and Pylab.
* How to read and interpret your graphs.
* How to use the code in this notebook to check if your school's detectors are working well.

Many of the functions in this notebook's code are beyond the scope of these teaching goals. To still satisfy the curious reader, some additional reading will be given so that you can try to figure out how they work for yourself. Perhaps in the future this notebook will be expanded to also explicitly cover the more complicated aspects of the code.

Before we get started: this notebook assumes you already know some things. These subjects are listed below. If you're not familiar with one or more of them, make sure you read up on them or ask a teacher about them before you continue.
* You are familiar with high school level mathematics.
* You know what HiSparc is and what research it does.

Now that all that is out of the way let's begin.

# Using Python

Python is a programming language often used by scientists to build scientific models or to analyse data. Perhaps you've already made a graph using Microsoft Excel, or even done some basic modelling using programs like CMA's Coach. These programs are great for small data sets or smaller simulations. However, when doing new science we often work with terabytes of data, if not more. With that much data Excel will simply crash, let alone let you manipulate the data or make graphs.

To test this, let's try to open the following file with Excel: *LINK TO GITHUB*

This is a slightly modified data file from HiSparc. You can use the link above to download it; it's about 200 MB. Once you've done that, open it with Windows Notepad (Kladblok). It will open quite quickly, but unfortunately Notepad has less graphing capability than an actual physical notepad. Don't close Notepad just yet; first let's open the file in Excel as well. To do this, open Excel and go to File -> Open -> Browse. In the browsing window, make sure to select "All files" (this can be found in the bottom right of the window). Now select the file you just opened in Notepad. Excel will open a new window that lets you import a .txt file. Make sure "Delimited" is selected in this window (this should be selected automatically) and then press Next. In the next window, also select "Other" and type "." (without the quotation marks, so just the period) in the text field next to it. Once you've done that, you can press Finish. Excel will now freeze; this is normal. After some time (you can try to count how long it takes if you wish) the file will open.

Excel will probably show an error message telling you it could not load the entire file. We can check how much of the data it actually loaded by scrolling all the way down and checking the date (leftmost column). Compare that to the latest date in Notepad; they probably don't match. This is because Excel simply gives up trying to load more data after a while. Needless to say, this is really bad if we want to actually use all the data.

Using Python we can actually use the entire data set, so let's set that up here. This notebook has a very nice property: aside from displaying text, it can also run Python code, like the code below.

In [3]:
print("Hello World!")

Hello World!


The bit of code above is usually the first bit of code anyone learning to program will write. It simply writes the text "Hello World!" in the output line. You can run the code by clicking on it to select the block of code and then pressing Ctrl + Enter, or by pressing the play (run) button at the top of this window. When you do so, the text "Hello World!" will appear below the code. That area is called the output line. It is where any text, images or error messages produced by the code are displayed.
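Coming back to the large file from the Excel experiment: the sketch below illustrates one reason Python copes with big files. It reads a file one line at a time, so the whole file never has to fit in memory at once. The function name `count_lines_and_last` and the small stand-in file are made up for this example; to try it on the real download you would pass the path of that file instead.

```python
import os
import tempfile


def count_lines_and_last(path):
    """Read a text file one line at a time; return (number of lines, last line)."""
    count = 0
    last = ""
    with open(path, "r") as handle:
        for line in handle:  # only one line is held in memory at a time
            count += 1
            last = line.rstrip("\n")
    return count, last


# Build a small stand-in file so this example runs on its own.
# Its contents are made up; it is NOT real HiSparc data.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    for i in range(1000):
        tmp.write(f"row {i}\n")
    demo_path = tmp.name

n_lines, last_line = count_lines_and_last(demo_path)
print(n_lines, "lines; last line:", last_line)

os.remove(demo_path)  # clean up the stand-in file
```

Because the loop looks at a single line at a time, the same function should work on a 200 MB file just as well; it will only take longer to finish.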

Before we can begin programming ourselves I highly recommend you look up an online Python tutorial to learn more. Here are some examples:
* Example 1
* Example 2
* Example N

The first thing we will want to do is import some packages. Python can do a lot on its own already, but not everything. For example, making a graph isn't possible without first writing an entire complicated program that can make graphs for you. This is where packages come in. One of the wonderful things about Python is that other people have already written programs for many of the things you might want to do and made those programs available to the public as packages. These packages can be installed and imported so that you can use the code others have already written. In the code block below we will import some packages that allow us to do more complicated math and make our own graphs easily.

In [2]:
#Import Packages
import time

import numpy as np

import pylab as pl

import tkinter.filedialog as tk

from scipy.optimize import curve_fit
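As a small illustration of what an imported package gives you, the sketch below computes an average once by hand and once with NumPy. The numbers are made up purely for the example; `np.array`, `np.mean` and `np.sqrt` are standard NumPy functions.

```python
import numpy as np

values = [1.0, 2.0, 3.0, 4.0]  # made-up numbers, only for illustration

# Without a package: write the average yourself.
mean_by_hand = sum(values) / len(values)

# With NumPy: one function call, and it works on whole arrays at once.
array = np.array(values)
mean_numpy = np.mean(array)
roots = np.sqrt(array)  # square root of every element in one go

print(mean_by_hand, mean_numpy)
print(roots)
```

The name `np` is just the short alias chosen in the import line (`import numpy as np`), so every NumPy function is reached as `np.something`.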

Now that we have done that we can begin by importing data from HiSparc into our program in a way that we can use.

# Importing HiSparc Data

First you will need to download some real data from the HiSparc website here: https://data.hisparc.nl/data/download/.
Here you must select a station of your liking. We recommend 501: Nikhef or your own school if it has a detector with some recently collected data. For the date we recommend selecting a recent starting date and the day after it as an end date. You can select a larger timespan but this increases the size of your data file by a lot. This will make the download take longer as well as cause the program to take drastically longer to finish its computation. 

Next we need to get the information in the file you just downloaded into Python. We'll be using the package "Tkinter" for this. Tkinter is a package for making windows, and it lets you select files through a nice file selection window like you're used to. First we need to set up Tkinter. We do this by creating a "root" window. This is an empty window in which we can display the things we want. But we don't want to display an empty window, so we'll make it invisible right away. Then we can open a file selection window on top of the empty window. This window hands the location of the file(s) we selected to Python; we will call them "files" so we can refer to them later in the code. Last, we need to close the window. To do all this we need the following code. Note that everything after a "#" doesn't add to the code itself but is instead a note (a comment) describing what a line or some lines do.

If no window opens when you run the cell below, use "alt+tab" to cycle through all open windows. You will most likely find it there.

In [3]:
root = tk.Tk()  # Create the root window
root.withdraw()  # Make the root window invisible
files = tk.askopenfilenames(parent=root, title='Choose any files')  # Open a file explorer and store the selected files as "files"
root.destroy()  # Close all windows

If this seems like a lot at once, that's okay, because it is. You don't have to be able to do this yourself. The important takeaway is that we have now told Python which file(s) we want to use and called them "files". Notice that we could select more than one file. This is because "files" is actually a list (strictly speaking a tuple, which behaves much like a list) that can hold multiple files. We need to keep that in mind in the future.
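To get a feel for what "files" looks like, the sketch below works with a stand-in for it. The two paths are hypothetical and only there so the example runs without opening a window; in the notebook, `files` would come from the file selection cell above and typically contains one text string (a path) per selected file.

```python
import os

# Hypothetical stand-in for the result of the file selection window:
# a tuple with one path (a text string) per selected file.
files = (
    "C:/Users/you/Downloads/example_a.tsv",
    "C:/Users/you/Downloads/example_b.tsv",
)

print("Number of selected files:", len(files))

# Because "files" can hold several entries, we loop over it.
for path in files:
    print(os.path.basename(path))  # file name without the folders in front

first_file = files[0]  # Python starts counting at 0
```

Even if you selected only one file, it still sits inside this collection, so you reach it with `files[0]` rather than with `files` itself.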

We're not done yet, however. Currently we have a list with our files in it, but we want Python to read what is in those files. Try opening the file you just downloaded with Notepad to see what it looks like. It should look something like this:
![title](img/file_example.png)
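One possible way to read such a file is sketched below. It assumes that explanatory lines start with "#" and that the columns of each data line are separated by tabs; check this against what you see in Notepad, because the layout used here is an assumption. The sample text is made up so that the example runs on its own. It is not real detector data.

```python
import io

# Made-up stand-in for a downloaded file; the column layout is an assumption.
sample = io.StringIO(
    "# comment line describing the file\n"
    "# date\ttime\tvalue\n"
    "2020-01-01\t00:00:01\t123\n"
    "2020-01-01\t00:00:02\t456\n"
)

rows = []
for line in sample:  # with a real file you would loop over open(files[0]) instead
    line = line.strip()
    if not line or line.startswith("#"):
        continue  # skip empty lines and comment lines
    rows.append(line.split("\t"))  # split the line into its columns

print(len(rows), "data rows")
print(rows[0])
```

Every entry in `rows` is a list of text strings, one per column. Before doing math with a column you would still need to convert its values to numbers, for example with `float(...)`.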

# Making a Graph 

# Analyzing the Graph

# Comparing Stations