# Introduction


### Course structure

This is a data science course in *Python3* (hereafter referred to simply as *Python*) designed for participants with basic Python experience. The course will run over **6 weeks** with the following structure.

- **Monday lectures** will start with Python syntax and the Jupyter notebook interface, then move through concepts such as how to write functions, how to handle data using the *pandas* and *numpy* packages, how to calculate summary information from a data frame, and approaches to plotting, modelling and the basics of machine learning. Each lecture will conclude with an **assignment**.

- **During the week (Tuesday-Thursday)**, participants are invited to review the materials presented in the Monday lecture and complete the assignment with the help of an assigned tutor via the Teams chat.

- In the **Friday recap**, the trainers will provide a walk-through of the assignment and answer any questions.


### Aims

The course will cover concepts and strategies for working with data more effectively in Python with the aim of:

- Writing **reusable** code, using Python's **functions, modules and libraries**
- Acquiring a working knowledge of **key concepts** which are prerequisites for advanced programming, data visualisation and modelling, and machine learning
- Expanding knowledge of *Python* with applications to life data sciences
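
As a small illustration of the first aim, the sketch below wraps a calculation in a function so that it can be reused on different inputs. The function name `summarise` and the example values are made up for illustration; `statistics` is a module from Python's standard library.

```python
import statistics


def summarise(values):
    """Return the mean and median of a list of numbers as a dictionary."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }


# The same function can be reused on any list of numbers
print(summarise([1, 2, 3, 4]))
print(summarise([10, 20, 60]))
```

Writing the calculation once and calling it many times keeps the code shorter and makes mistakes easier to find and fix in one place.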


### Audience

This course is open to any colleagues with some basic knowledge of Python. We are so excited that you want to learn Python :) ! We will start with a brief recap of basic Python concepts and build up from there. You will set the pace and the amount of material that we cover.


### Feedback

Questions, suggestions and ideas from participants are welcomed via the Teams chat, e-mail or during the lectures and recaps. Enjoy!


### Obtaining course materials

The course materials (lectures, assignments and solutions) are accessible via GitHub: https://github.com/semacu/202105-data-science-python

We’d like you to follow along with the example code as we go through the course materials together, and attempt the assignment to practice what you’ve learned.

The course materials will be updated throughout the course, so we recommend downloading the most recent version of the materials before each lecture or recap session. The latest notebooks and relevant materials for this course can be obtained as follows:

1. Go to the GitHub page for the course: https://github.com/semacu/202105-data-science-python

2. Click on the green **Code** button (right, above the list of folders and files). This will cause a drop-down menu to appear

3. Click on the **Download ZIP** option. A zip file containing the course content will be downloaded to your computer

4. Move the zip file to your preferred directory, e.g. your home directory

5. Decompress the zip file to get a folder containing the course materials. Depending on your operating system, you may need to double-click the zip file, or issue a command on the terminal. On Windows 10, you can right click, click **Extract All...**, click **Extract**, and the folder will be decompressed in the same location as the zip file

6. Launch Jupyter Notebook. Depending on your operating system, you may be able to search for "Jupyter" in the system menu and click the icon that appears, or you may need to issue a command on the terminal. On Windows, you can hit the Windows key, search for "Jupyter", and click the icon that appears. For more information on installing Jupyter Notebook, please continue reading the next section

7. After launching, the Jupyter notebook home menu will open in your browser. Navigate to the course materials that you decompressed in step 5, and click on the lecture or recap notebook of the week to launch it.
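
If you prefer, the decompression in step 5 can also be done from Python itself. The sketch below uses the standard-library `zipfile` module. The function name `extract_course_materials` and the file name in the usage comment are assumptions, so adjust them to match the zip file you actually downloaded.

```python
import zipfile
from pathlib import Path


def extract_course_materials(zip_path, destination=None):
    """Extract a zip archive and return the folder it was extracted into.

    If no destination is given, the archive is extracted next to the zip file.
    """
    zip_path = Path(zip_path)
    destination = Path(destination) if destination else zip_path.parent
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(destination)
    return destination


# Example usage (hypothetical file name and location, adjust to your download):
# extract_course_materials(Path.home() / "202105-data-science-python-main.zip")
```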


### Python and Jupyter

**Python** is a general-purpose programming language that is useful for writing scripts to work effectively and reproducibly with data. It was created by Guido van Rossum and first released in 1991. Python is widely used in data science, bioinformatics and scientific computing, in both academia and industry.

It is available for all popular operating systems (Mac, Windows and Linux). The default Python installation comes with "batteries included": the standard library (some of which we will see in this course) provides built-in support for many common tasks, e.g. numerical and mathematical functions, and interacting with files and the operating system. There is also a wide range of external libraries for areas not covered by the standard library, such as *pandas* (the Python Data Analysis Library), *matplotlib* (the Python plotting library) and *biopython*, which provides tools for bioinformatics.
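
To give a flavour of the "batteries included" standard library, the short sketch below uses two built-in modules: `math` for mathematical functions and `pathlib` for working with files and folders. Nothing needs to be installed beyond Python itself, and the variable names are only illustrative.

```python
import math
from pathlib import Path

# Mathematical functions from the standard library
radius = 2.0
area = math.pi * radius ** 2
print(f"Area of a circle with radius {radius}: {area:.2f}")
print("Square root of 16:", math.sqrt(16))

# Interacting with files and the operating system
current_folder = Path.cwd()
print("Current working folder:", current_folder)
print("Number of items in this folder:", len(list(current_folder.iterdir())))
```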

**Jupyter** is a non-profit project created to "develop open-source software, open-standards, and services for interactive computing across dozens of programming languages". Jupyter supports execution environments and has developed interactive computing products such as Jupyter Notebook, which we will be using during this course.

The Jupyter Notebook is a web application that allows you to create and share documents that contain live code, equations, interactive data visualizations and explanatory text.

**How to run Python?** Python is an interpreted language. This means that your computer does not run Python code natively; instead, we run our code using the Python interpreter. There are three ways in which you can run Python code:

- Directly typing commands into the interpreter: good for experimenting with the language, and for some interactive work e.g. using the command line and/or IPython
- Using a Jupyter Notebook: great for experimenting with the language, as well as for sharing and learning
- Typing code into a file and then telling the interpreter to run the code from this file: good for larger scripts, and when you want to run the same code repeatedly
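
As a sketch of the third option, the code below could be saved in a file (the name `hello.py` is only an example) and run from a terminal with `python hello.py`, or `python3 hello.py` on some systems. The same lines can also be typed into the interpreter or pasted into a Jupyter Notebook cell.

```python
# hello.py (hypothetical file name)

def greet(name):
    """Return a short greeting for the given name."""
    return f"Hello, {name}!"


# This block runs when the file is executed as a script,
# but not when the file is imported as a module by other code.
if __name__ == "__main__":
    print(greet("world"))
```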




# Installations

Before starting this course, you need to have Python3 and Jupyter installed on your computer. If you do not have these installed already, we recommend installing Anaconda (a complete programming environment including Python3 and Jupyter) by following the instructions below:

### Windows

1. Open the AZ Software Store. The home screen should look like the following:

<img src="../img/az_softwarestore_1.png">


2. In the "Search Catalog" bar at the top, search for "anaconda". This should return "Anaconda3 2019.10":

<img src="../img/az_softwarestore_2.png">

3. Click the "Add to Cart" button and this should add it to your basket. *Note: make sure you have added the correct version of Anaconda to your cart ("Anaconda3 2019.10"), as other versions e.g. "Anaconda 5.3" may not be suitable for the contents of this course. If using one of the latest versions of the Software Store, you may need to add Anaconda Navigator instead*

4. Click the Cart icon on the top right of the screen, and a preview of the contents of your cart should be displayed:

<img src="../img/az_softwarestore_3.png">

5. Click the "View cart and checkout" button, and you should be taken to a summary of your basket:

<img src="../img/az_softwarestore_4.png">

6. Click the buttons "Me on machine" and "Install Anaconda3 2019.10", then click the "Next" button. *Note: if you are using an older version of the Software Store you may have to check that the "Receive ASAP" option is selected.*

7. Click the "Submit" button. You should go through to a "Request Complete" screen. Anaconda will be installed on your computer within a few hours. 

If the steps above do not work, you may want to get in touch with AZ IT. If you can't get hold of IT in time, try the [Anaconda installation instructions for Windows](https://docs.anaconda.com/anaconda/install/windows/).

Please let us know via the Teams channel if you experience any issues with installations before or during the course. We have a dedicated 1h troubleshooting session before lecture 1 to resolve any installation problems.


### Linux and macOS

Follow these links for instructions on installing Anaconda on [Linux](https://docs.anaconda.com/anaconda/install/linux/) or [macOS](https://docs.anaconda.com/anaconda/install/mac-os/).


### Anaconda installation check

Once you have installed Anaconda, it is a good idea to check that your installation is working ok: 

1. Open the "Anaconda Navigator (Anaconda3)" program, and you should get this home screen:

<img src="../img/anaconda_navigator.png">

2. Click the "Launch" button underneath the Jupyter Notebook icon

3. A tab should now open in your web browser, showing the Jupyter logo at the top and your file system below. You can click the "New" button in the top right and a dropdown menu will appear. Then click "Python 3" and a new blank notebook will open.

If the above steps work, Anaconda (along with Python and Jupyter) is installed correctly.
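
As an optional extra check, you could paste the sketch below into the first cell of the new notebook and run it. It prints the Python version in use and reports whether some of the packages mentioned in this introduction can be found. The helper name `check_packages` and the exact list of packages are our own choices for illustration, not an official Anaconda check.

```python
import importlib.util
import sys


def check_packages(packages):
    """Return a dictionary mapping each package name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}


# Report the Python version that is running this code
print("Python version:", sys.version.split()[0])
if sys.version_info.major < 3:
    print("Warning: this course requires Python 3")

# Check some of the packages used in the course (illustrative list)
for name, found in check_packages(["numpy", "pandas", "matplotlib"]).items():
    print(f"{name}: {'installed' if found else 'NOT found'}")
```

If any package is reported as not found, please mention it in the Teams channel so that it can be resolved before the first lecture.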
