# Why Python

# TOC

* What is Python?
* Why Python? 
* How does Python relate to Matlab/R?

Adapted from Tal Yarkoni's [slides](https://github.com/neurohackademy/introduction-to-python/blob/master/introduction-to-python.ipynb)

# What is Python?

* Python is a programming language
* Specifically, it's a widely used, very flexible, high-level, general-purpose, dynamic programming language
* That's a mouthful! Let's explore each of these points in more detail...

### Widely-used
* Python has been one of the fastest-growing major programming languages, as measured by Stack Overflow question traffic

<img src="images/languages.svg">

[source](https://insights.stackoverflow.com/trends)

### High-level
Python features a high level of abstraction
* Many operations that are explicit in lower-level languages (e.g., C/C++) are implicit in Python
* E.g., memory allocation, garbage collection, etc.
* Python lets you write code faster

#### File reading in Java
```java
import java.io.FileReader;
import java.io.IOException;
 
public class ReadFile {
    public static void main(String[] args) throws IOException{
        String fileContents = readEntireFile("./foo.txt");
    }
 
    private static String readEntireFile(String filename) throws IOException {
        StringBuilder contents = new StringBuilder();
        char[] buffer = new char[4096];
        // try-with-resources closes the reader even if read() throws
        try (FileReader in = new FileReader(filename)) {
            int read;
            while ((read = in.read(buffer)) >= 0) {
                contents.append(buffer, 0, read);
            }
        }
        return contents.toString();
    }
}
```

#### File reading in Python
```python
file_contents = open("./foo.txt").read()  # read the whole file into a string
```
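The one-liner above leaves closing the file to the interpreter. A slightly more defensive sketch is shown here; the function names `read_entire_file` and `read_entire_file_pathlib` are hypothetical (chosen to mirror the Java example), and UTF-8 is an assumed encoding.

```python
from pathlib import Path


def read_entire_file(filename, encoding="utf-8"):
    """Return the full contents of a text file as one string."""
    # The with block closes the file even if reading raises an exception.
    with open(filename, encoding=encoding) as handle:
        return handle.read()


def read_entire_file_pathlib(filename, encoding="utf-8"):
    """Same result via pathlib, which opens, reads, and closes in one call."""
    return Path(filename).read_text(encoding=encoding)
```

Either version is still far shorter than the Java equivalent, and neither needs an explicit buffer or read loop.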

### General-purpose
You can do almost everything in Python
* Comprehensive standard library
* Enormous ecosystem of third-party packages
* Widely used in many areas of software development (web, dev-ops, data science, etc.)
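As a rough illustration of how far the standard library alone can go, the sketch below counts words, computes a simple statistic, and serializes the result to JSON without any third-party package. The function name `summarize_words` and the sample sentence are made up for this example.

```python
import json
from collections import Counter
from statistics import mean


def summarize_words(text):
    """Return simple word statistics for a piece of text as a JSON string."""
    words = text.lower().split()
    counts = Counter(words)  # maps each word to its number of occurrences
    summary = {
        "n_words": len(words),
        "n_unique": len(counts),
        "most_common": counts.most_common(1),
        "mean_word_length": mean(len(word) for word in words),
    }
    return json.dumps(summary)


report = summarize_words("the cat and the hat")
print(report)
```

The same task in a lower-level language would typically need a hash map implementation, a JSON library, and manual string handling.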

### Dynamic
Code is interpreted at run-time
* No separate compilation step: the interpreter compiles source to bytecode automatically when the code is run
* Eliminates delays between development and execution
* The downside: poorer performance compared to compiled languages
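A minimal sketch of what "dynamic" means in practice: a name can be rebound to values of different types while the program runs, and source text can be compiled and executed at run time. The helper `describe` and the variable names are illustrative only.

```python
def describe(value):
    """Return the name of the type that value has at run time."""
    return type(value).__name__


x = 42
kinds = [describe(x)]
x = "forty-two"  # the same name is now bound to a str; no declaration needed
kinds.append(describe(x))
print(kinds)

# Source code held in a string can be compiled and run while the program executes.
source = "result = sum(range(5))"
namespace = {}
exec(compile(source, "<example>", "exec"), namespace)
print(namespace["result"])
```

The flexibility comes at a cost: type errors surface only when the offending line actually runs, rather than at a compile step.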

# Why Python? 

* Free software released under an open-source license: Python can be used and distributed free of charge, even for building commercial software.
* Multi-platform: Windows, Linux/Unix, macOS, mobile operating systems, etc.
* A very readable language with clear, non-verbose syntax
* Fairly easy to learn
* Multi-purpose (data analysis, web scraping, websites...)
* Relevance beyond research

    
excerpt from [scipy-lectures](http://www.scipy-lectures.org/intro/language/python_language.html)

## Why Python
[Kaggle: The State of Data Science & Machine Learning 2017](https://www.kaggle.com/kaggle/kaggle-survey-2017)

![](images/kaggle1.png)


![](images/kaggle2.png)

# How does Python relate to Matlab/R?

* Python competes for mind share with many other languages
* Most notably, R
* To a lesser extent, Matlab, Mathematica, SAS, Julia, Java, Scala, etc.

## R
* [R](https://www.r-project.org/) is dominant in traditional statistics and some fields of science
    * Has attracted many SAS, SPSS, and Stata users
* Exceptional statistics support; hundreds of best-in-class libraries
* Designed to make data analysis and visualization as easy as possible
* Often slow, especially for loop-heavy code that is not vectorized
* Can be incredibly [weird](https://ironholds.org/projects/rbitrary/) (especially when you leave the tidyverse). Language quirks drive many experienced software developers crazy. 
* Less support for most things non-data-related

### R/Python

Python | R
----|:----
open | open
great community | great community
computer-science background | statistics background
general purpose | stats focus
easy to learn | can be tricky to learn
decent plotting (seaborn) | very intuitive plotting (ggplot)
decent stats (statsmodels) | great stats functionality 
Jupyter notebooks | RMarkdown notebooks
manuscript rendering possible (nbconvert, knitpy) | manuscript rendering works very well (knitr etc.)


**It can't hurt to dip your toes into both pools and select the right tool for the job at hand.**

### R/Python
* You can run [R from Python](https://rpy2.github.io)
* You can run [Python in R Markdown notebooks](https://rstudio.github.io/reticulate/articles/r_markdown.html)
* Python `pandas` commands vs `R`: [Comparison with R / R libraries](https://pandas.pydata.org/pandas-docs/stable/comparison_with_r.html)


## Matlab
* A proprietary numerical computing language widely used by engineers
* Good performance and very active development, but expensive
* Closed ecosystem, relatively few third-party libraries
    * There is a largely compatible open-source alternative (GNU Octave)
* Not suitable for use as a general-purpose language

### Python for Matlab users
* [NumPy for Matlab users](https://numpy.org/doc/stable/user/numpy-for-matlab-users.html)
* [A Python Primer for Matlab Users](http://bastibe.de/2013-01-20-a-python-primer-for-matlab-users.html)
* [Python for Matlab Users](http://researchcomputing.github.io/meetup_fall_2014/pdfs/fall2014_meetup13_python_matlab.pdf)

## So, why Python?
Why choose Python over other languages?
* Arguably no other language offers the same combination of readability, flexibility, libraries, and performance
* Python is sometimes described as "the second best language for everything"
* Doesn't mean you should always use Python
    * Depends on your needs, community, etc.

# Some Python philosophy

Running `import this` in a Python session prints:

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


from [xkcd](https://imgs.xkcd.com/comics/python.png)

![](https://imgs.xkcd.com/comics/python.png)