<div class="alert block alert-info alert">

# <center> Scientific Programming in Python
## <center>Karl N. Kirschner, Ph.D.<br>Bonn-Rhein-Sieg University of Applied Sciences<br>Sankt Augustin, Germany

# <center> Course Introduction

<hr style="border:2px solid gray"></hr>

## A sneak peek into our future

Highlights:
- <font color='dodgerblue'>importing</font> a library
- range function (and unpacking it using a <font color='dodgerblue'>* (star)</font>)
- <font color='dodgerblue'>readable</font> object names (e.g. "x_values" versus something like "x1")
- <font color='dodgerblue'>f-string</font> statements
- an object's type
- list comprehension
- <font color='dodgerblue'>proper and consistent</font> use of spacing
- plotting data

In [None]:
import matplotlib.pyplot as plt


x_values = range(1, 8, 1)
y_values = [item**2 for item in x_values]  ## a list comprehension

print(f'The x_values is of type: {type(x_values)}.\n')

print(f'The values for the x-axis are: {[*x_values]}.')  ## The '*' unpacks the range object into a list.
print(f'The values for the y-axis are: {y_values}.')

plt.figure()

plt.plot(x_values, y_values,
         marker='.', markersize=24,
         linewidth=5, linestyle='-',
         color='red')

plt.xlabel('X Label (unit)')
plt.ylabel('Y Label (unit)')

plt.show()

<hr style="border:2px solid gray"> </hr>

# Keys to success in this course

### <font color='dodgerblue'>$\textrm{C}^3$</font>: code is written <font color='dodgerblue'>concisely</font>, with a <font color='dodgerblue'>clear</font> thought process, and placed into <font color='dodgerblue'>context</font>

1. **Concise**, cleanly written code and output
    - Easy to read and understand
    - Reduced chance of introducing programmer error
    - Easier to debug


2. **Clear** thought process with a logical grouping of code
    - User-defined functions that contain a single concept
    - Logical separation and isolation of individual ideas (e.g. using separate code cells & user-defined functions)
    - Promotes usability and reusability in future code (i.e. user-defined functions)
    - Easier to debug


3. **Context** for the code's a) purpose and b) usage are provided
    - Block comments, in-line comments and docstrings (e.g. in user-defined functions) - purpose, usage, special notes
    - Jupyter-notebook markdown language (citations, data interpretation)
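
The three points above can be sketched with a short example. The function name `celsius_to_kelvin` and its use case are illustrative assumptions, not part of the course material. The function is short, holds a single concept, and states its purpose and usage in a docstring:

```python
def celsius_to_kelvin(temperature_celsius: float) -> float:
    """Convert a temperature from degrees Celsius to Kelvin.

    Args:
        temperature_celsius: temperature in degrees Celsius

    Returns:
        The temperature in Kelvin.
    """
    ## 0 degrees Celsius corresponds to 273.15 Kelvin
    return temperature_celsius + 273.15


room_temperature = celsius_to_kelvin(temperature_celsius=25.0)
print(f'Room temperature: {room_temperature:.2f} K')
```

The readable object names and the unit in the printed output supply context without extra comments.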


### <font color='dodgerblue'>K.I.S.S.</font>: <font color='dodgerblue'>K</font>eep <font color='dodgerblue'>I</font>t (i.e. coding) <font color='dodgerblue'>S</font>imple & <font color='dodgerblue'>S</font>mart
- $\textrm{C}^3$ - concise, clear and context
- Use of built-in functions over libraries with large overhead
- User-defined functions for reproducibility, reuse, error reduction and isolating ideas
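
As a small sketch of the second point, an average can be computed with the built-in functions `sum` and `len`, without importing an external library. The object name `measurements` and its values are made up for illustration:

```python
measurements = [2.0, 4.0, 6.0, 8.0]

## Built-in functions are enough here; no external library is needed
average = sum(measurements) / len(measurements)

print(f'The average is: {average}.')
```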

<font color='dodgerblue'>$\textrm{C}^3$ and K.I.S.S. are the same keys to success that are found in **all** scientific work.</font>

### Academic Scholarship:

**You will need to do this for your thesis.**

- **Citing sources** of existing knowledge
    - Providing credit to scientists and programmers
    - Helps to enable reproducibility
    - Indicates that you are well-educated (and you pay attention to details)
    
- **Communicating** your thoughts/ideas (so that fewer assumptions are made)

- Writing
    - $\textrm{C}^3$
    - K.I.S.S.
    - Using complete sentences (i.e. a subject, a verb and usually an object)

- Providing units for numbers when appropriate


<hr style="border:2px solid gray"> </hr>

## Scientific Programming

**Definition**
1. Programming whose goal is scientific usage (e.g. workflows, data analysis) and visualization.
2. Learning to program in an academic, scholarly manner (i.e. "wissenschaftliches Arbeiten", the German term for scholarly working practice).

**3 Ways to Think About It**
1. **Usage**: to perform mathematics (from simple numerical computations to complex math models)

2. **Practice**: to create while maintaining good scholarship
    - knowing what is state-of-the-art
    - careful, clear and supportive
    - "A machine requires precise instructions, which users often fail to supply" [1]

3. **Target**: to support science (doing research, data support and analysis)
    - "Scientists commonly use languages such as Python and R to conduct and **automate analyses**, because in this way they can **speed data crunching**, **increase reproducibility**, protect data from accidental deletion or alteration and handle data sets that would overwhelm commercial applications." [1]
    - Create workflows to help do the research
    - Create simulations (increasingly important in research)
        - exploratory: for understanding raw data
        - supportive: for strengthening interpretations of the data
        - predictive: creating new ideas

[1] Baker, Monya. "Scientific computing: code alert." Nature 541, no. 7638 (2017): 563-565.

<hr style="border:2px solid gray"> </hr>

<div class="alert block alert-info alert">

## Why is this Important?
<br>

- "Societally important **science relies on models and the software implementing them**. The scientific community must ensure that the findings and recommendations put forth based on **those models conform to the highest scientific expectation**" [2]

[2] L. N. Joppa et al., “Troubling Trends in Scientific Software Use,” Science, 340(6134), 814–815, 2013.

**Positive Example:**
ChemRxiv (a pre-print science paper archive) sent out an email on March 20th, 2021 highlighting the recent research submissions on the Coronavirus. 9 out of the 10 highlights had a significant amount of computer modeling (computational chemistry). In other words, computer models were the first to get some research results that target the pandemic.


![image](00_images/ChemRxiv_coronavirus_2020.png)


**Negative Example:**
- A 2001 article and its 2006 retraction: G. Chang et al., “Retraction,” Science, 314(5807), 1875, 2006
    - Published in one of the two top scientific journals
    - Highly respected
    - Peer-reviewed, but the reviewers very likely **did not see the analysis code** that was written and used by the authors
        - a case for open source software

![image](00_images/ChangRRPCC2006.png)

<hr style="border:2px solid gray"> </hr>

## Why Python?

- Accessible and readable (especially to people outside of computer science)
    - Natural scientists
    - Engineers
- Powerful due to the number of libraries available that are created by domain experts and programmers
- Good for creating larger programs - create functions that do specific tasks
- Can call Bash commands from inside Python (e.g. via "import os")
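
A minimal sketch of the last point, assuming a system shell that provides an `echo` command (the command itself is only an illustrative choice). `os.system` passes a command string to the shell. `subprocess.run`, also from the standard library, is a common alternative when the command's output is needed:

```python
import os
import subprocess

## os.system hands the command to the operating system's shell
## and returns the exit status (0 usually means success)
exit_status = os.system('echo Hello from the shell')
print(f'Exit status: {exit_status}')

## subprocess.run can additionally capture what the command printed
result = subprocess.run('echo Hello again', shell=True,
                        capture_output=True, text=True)
print(f'Captured output: {result.stdout.strip()}')
```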


Most popular programming languages (via reference [3])
1. JavaScript
2. **Python**
3. Java
4. PHP
5. C#
6. CSS
7. C++

![image](00_images/redmonk_2022.png)

[3] Stephen O'Grady, "The RedMonk Programming Language Rankings: June 2022", https://redmonk.com/sogrady/2022/10/20/language-rankings-6-22/, October 20, 2022. Accessed on March 27, 2023.

<hr style="border:1px solid gray"> </hr>

## The Disciplines that use Python - Its "Main" Categories

For a Python job, one should know a) Python's core and b) one of the following categories:

- Web - flask, django, html, javascript
- Data engineering (collecting data) - sql, airflow, luigi
- Software engineering - git, unit testing, large codebases
- Cyber security - requests, volatility, pyew, peepdf, penetration tests
- **Data science / scientific python** (<font color='dodgerblue'>seeking new knowledge</font>)

## What are the most important libraries to know?

### Top imported libraries in 150 GitHub computational chemistry + machine learning (i.e. natural scientists who know about coding) repositories
- <font color='dodgerblue'>Numpy</font>
- <font color='dodgerblue'>Pandas</font>
- PyTorch
- sklearn
- <font color='dodgerblue'>Matplotlib</font>
- <font color='dodgerblue'>SciPy</font>
- TensorFlow
![image](00_images/top_compchem_libraries.png)

<hr style="border:2px solid gray"> </hr>

## Misc. Background

- General-purpose object-oriented programming language


- Contains common programming concepts like
    - statements
    - expressions
    - operators
    - modules
    - methods
    - classes


- Has options for IDE (Integrated Development Environment) usage: [IDLE](https://docs.python.org/3/library/idle.html), [PyCharm](https://www.jetbrains.com/pycharm/download), [Visual Studio Code](https://code.visualstudio.com/), [Sublime](https://www.sublimetext.com/)


- OS independent


- Good for scientists/researchers dealing with lots of data


- Not the best option for developing fast parallel programs


- Fun to use


- Python interpreter: python3 (python2 is out-of-date)

<hr style="border:2px solid gray"> </hr>

## Getting Python3

- Having Python (and desired libraries) installed onto your computer
    - It may already be installed
    - If not: https://www.python.org/downloads


- Miniconda (https://docs.conda.io/en/latest/miniconda.html)
    - Open-source package and environment management system
    - THE way to manage *isolated* Python environments and libraries
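
Whichever installation route is chosen, it can help to check which interpreter and version are actually in use, for example after switching between isolated environments. A minimal sketch using only the standard library:

```python
import sys

## Path to the Python interpreter that is running this code
print(f'Interpreter: {sys.executable}')

## Version of that interpreter (major.minor.micro)
version = sys.version_info
print(f'Version: {version.major}.{version.minor}.{version.micro}')
```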

<hr style="border:2px solid gray"> </hr>

## Writing and Executing Code

In order of general helpfulness and importance:

1. <font color='dodgerblue'>**Google's Colaboratory**</font>: https://colab.research.google.com
    - Written in a browser and online
    - Execute: directly and online


2. <font color='dodgerblue'>**Jupyter Notebooks**</font>: https://jupyter.org - Recommended for novice and experienced programmers
    - Written online and offline
    - Execute: directly and offline


3. Text editor and integrated development environment (IDE)
    - Written using simple editors (e.g. texteditor, gedit) - can be problematic and tedious
    - Written using an IDE (sophisticated)
    - Execute:
        - In a terminal (e.g., 'python program_name.py') and offline
        - Directly through the IDE and offline


4. Starting Python3 in a terminal (e.g., bash)
    - Linux: 'Menu' -> 'Terminal' -> 'python3'
    - Macintosh: 'Application' -> 'Utilities' -> 'Terminal' -> 'python3'
    - Windows: ? -> 'python3'
    - You can exit by typing 'exit()' or pressing 'Ctrl D' ('Strg D' on German keyboards)
    - **Warning**: Done in real-time, and thus the code and results are not saved
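
As a minimal sketch of option 3, the lines below could be saved in a file and executed in a terminal with `python program_name.py`. The file name follows the example in the list above; the file's content is an illustrative assumption:

```python
def main():
    """Print the squares of the integers 1 through 3."""
    for number in range(1, 4):
        print(f'{number} squared is {number**2}')


## Run main() only when the file is executed as a script,
## and not when it is imported as a module
if __name__ == '__main__':
    main()
```

Unlike the interactive interpreter in option 4, the code stays saved in the file and can be rerun at any time.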

<hr style="border:2px solid gray"> </hr>

## Important Course Information

1. Lectures will be ca. 90 minutes long


2. We will make use of LEA, including handing in projects


3. I am available through university email (I tend to respond to these fairly quickly)


4. Individual/group online meetings can be made upon request


5. We will use the WebEx software, and adjust if needed


## Homework Instructions

### Coding and Turning In Homework:

Solutions will

1. be written in **English**,
2. include your **name** at the top of the notebook,
3. be written **independently** (i.e., no plagiarism), and
4. be turned in as a **Jupyter Notebook** file via **LEA**.

<hr style="border:2px solid gray"> </hr>

### Grades

**Homework will be given points from 0 to 100**<br>

**Mark given for point range**<br>
1.0: 100 -- 95<br>
1.3: 94 -- 90<br>
1.7: 89 -- 85<br>
2.0: 84 -- 80<br>
2.3: 79 -- 75<br>
2.7: 74 -- 70<br>
3.0: 69 -- 65<br>
3.3: 64 -- 60<br>
3.7: 59 -- 55<br>
4.0: 54 -- 50<br>
5.0: 49 -- 0<br>

<font color='dodgerblue'>**See the course syllabus on LEA for more complete information.**</font>

<hr style="border:2px solid gray"> </hr>

# Python Easter Eggs

## 1. Python's Philosophy Through a Poem.

This poem captures many of the important points about Python and its coding style.

In [None]:
import this

The source code (https://github.com/python/cpython/blob/main/Lib/this.py) stores the poem encoded with a ROT13 substitution cipher (https://en.wikipedia.org/wiki/ROT13).
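
A minimal sketch of how ROT13 works, using the `rot_13` text transform from the standard-library `codecs` module. The example string is an arbitrary choice:

```python
import codecs

plain_text = 'Hello'

## Each letter is shifted by 13 positions in the alphabet
encoded_text = codecs.encode(plain_text, 'rot_13')
print(f'Encoded: {encoded_text}')

## Applying ROT13 a second time restores the original text
decoded_text = codecs.encode(encoded_text, 'rot_13')
print(f'Decoded: {decoded_text}')
```

Because the alphabet has 26 letters, shifting by 13 twice returns every letter to its starting position, so the same operation both encodes and decodes.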

## 2. Experience Antigravity

In [None]:
import antigravity