# Introduction to scientific computing with Python


***
## Why Python?

### A powerful, all-purpose language with a great syntax (*"executable pseudocode"*)

<img src="./_img/xkcd_learning_curve_wave_python.png" width="700px"> 

Source: [Fabien Maussion](http://fabienmaussion.info/)

### Interoperability with other languages

[Guido van Rossum](https://gvanrossum.github.io//index.html), creator of Python and formerly its Benevolent Dictator For Life (BDFL):

_... I never intended Python to be the primary language for programmers ..._

_... bridge the gap between the shell and C ..._


<img src="./_img/guido.png" width="500px">

Source: [SD Times](https://sdtimes.com/guido-van-rossum/python-creator-proposes-type-annotations-programming-language/) (2014) & [A Conversation with Guido van Rossum](https://www.artima.com/intv/) (2002)
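As a small illustration of bridging to C, the sketch below calls the C standard library function `strlen` from Python through the built-in `ctypes` module. It assumes a Unix-like system (Linux or macOS) where the C library can be loaded; the wrapper name `c_strlen` is made up for this example.

```python
import ctypes
import ctypes.util

# Locate the C standard library. find_library may return None on some
# systems; on Unix-like systems CDLL(None) then falls back to the symbols
# already loaded into the running process, which include the C library.
libc_path = ctypes.util.find_library("c")
libc = ctypes.CDLL(libc_path)

# Declare the C signature: size_t strlen(const char *s)
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(text: str) -> int:
    """Return the length in bytes of `text` as computed by C's strlen."""
    return libc.strlen(text.encode("utf-8"))


print(c_strlen("shell"))
```

Declaring `argtypes` and `restype` explicitly lets `ctypes` check and convert the arguments instead of relying on its defaults.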

### Open and encouraging community

<img src="./_img/growth_major_languages-1-1400x1200.png"  width="650px">

Source: [Blog post by David Robinson](https://stackoverflow.blog/2017/09/06/incredible-growth-python/) (September 6, 2017)




### Batteries included and third-party modules

<img src="https://imgs.xkcd.com/comics/python.png" width="500px">

Source: [xkcd](https://xkcd.com/353/)
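To give a feel for what "batteries included" means in practice, here is a minimal sketch that reads CSV data, computes summary statistics, and writes JSON using only the standard library. The station names, the temperature values, and the helper `summarize` are invented for illustration.

```python
import csv
import io
import json
import statistics

# Hypothetical measurements, inlined as CSV text so the example is self-contained.
CSV_TEXT = """station,temperature_c
A,12.5
B,14.0
C,13.5
"""


def summarize(csv_text: str) -> dict:
    """Parse the CSV text and return simple summary statistics."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    temps = [float(row["temperature_c"]) for row in rows]
    return {
        "n": len(temps),
        "mean": statistics.mean(temps),
        "median": statistics.median(temps),
    }


summary = summarize(CSV_TEXT)
print(json.dumps(summary, indent=2))
```

No third-party package is needed here; libraries such as NumPy or pandas become useful once the data grow larger or the analysis gets more involved.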

***
## Reproducible Research



The term [reproducible research](https://en.wikipedia.org/wiki/Reproducibility#Reproducible_research) refers to the idea that the ultimate product of academic research is not the paper alone, but the paper together with the laboratory notebooks and the full computational environment (code, data, etc.) used to produce its results. Others can then reproduce those results and build new work on the research. Typical examples of reproducible research are compendia of data, code and text files, often organised around an __[R Markdown source document](https://rmarkdown.rstudio.com/)__ or a __[Jupyter notebook](http://jupyter.org/)__.


<img src="./_img/pipeline.png" width="800px">

(after Peng et al. 2006)
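Two small habits that support reproducibility can be sketched in code: recording the computational environment alongside the results, and fixing the seed of any random number generator. The sketch below is one possible approach, not a complete solution; the function names `describe_environment` and `simulate` and the default package list are assumptions made for this example.

```python
import json
import platform
import random
from importlib import metadata


def describe_environment(packages=("numpy", "pandas")) -> dict:
    """Collect the Python version, platform, and selected package versions."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "packages": versions,
    }


def simulate(seed: int, n: int = 5) -> list:
    """Draw n pseudo-random numbers from a generator with a fixed seed."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


env = describe_environment()
print(json.dumps(env, indent=2))

# The same seed yields the same sequence on every run.
print(simulate(seed=42) == simulate(seed=42))
```

Storing the output of `describe_environment` next to the results makes it easier to rebuild a matching environment later.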

***
## Literate Programming

[Literate programming](https://en.wikipedia.org/wiki/Literate_programming) is a programming paradigm introduced by [Donald Knuth](https://en.wikipedia.org/wiki/Donald_Knuth) in which a program is given as an explanation of the program logic in a natural language, interspersed with snippets of macros and traditional source code.

The literate programming paradigm enables programmers to develop programs in the order demanded by the logic and flow of their thoughts. Literate programs are written much like the text of an essay, in which macros are included to hide abstractions and traditional source code.

### _The Scientific Paper Is Obsolete_
by James Somers, [The Atlantic, Apr 5, 2018](https://www.theatlantic.com/science/archive/2018/04/the-scientific-paper-is-obsolete/556676/)

```python
from IPython.display import IFrame

# Embed the article in the notebook as an inline frame.
IFrame('https://www.theatlantic.com/science/archive/2018/04/the-scientific-paper-is-obsolete/556676/', width="100%", height=450)
```

***
## Data Science

### What is data science?
   

_Data science [...] is an **interdisciplinary field of scientific methods, processes, and systems** to extract **knowledge or insights from data**[...]_ ([Wikipedia](https://en.wikipedia.org/wiki/Data_science))

_It employs techniques and theories drawn from many fields within the broad areas of_
* _mathematics_, 
* _statistics_, 
* _information science_, and 
* _computer science_, in particular from the subdomains of 
    * _machine learning_, 
    * _classification_, 
    * _cluster analysis_, 
    * _data mining_, 
    * _databases_, and 
    * _visualization._

The [Harvard Business Review](https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century) (2012) called data scientist "**The Sexiest Job of the 21st Century**".

<img src="./_img/data_science.png"  width="600px"> 

Source: [Michael Barber](https://towardsdatascience.com/introduction-to-statistics-e9d72d818745)