<!--NAVIGATION-->
<span style='background: rgb(128, 128, 128, .15); width: 100%; display: block; padding: 10px 0 10px 10px'>< [Welcome](00.00-Welcome.ipynb) | [Contents](00.00-Index.ipynb) | [Installing Python & Tools](01.02-Installing-Python.ipynb) ></span>

<a href="https://colab.research.google.com/github/eurostat/e-learning/blob/main/python-official-statistics/01.01-Why-Python.ipynb"><img align="left" src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open in Colab" title="Open and Execute in Google Colaboratory"></a>

<a id='top'></a>

# Why Python?
## Content  
- [What is Python?](#what)  
- [Some Numbers](#some)  
- [Some more numbers](#more)  
- [Features](#features)  
- [Syntax and Design](#syntax)  
- [Common Uses](#common)  

Let's find out together whether Python is a good choice as a scientific and statistical tool and, if so, why. But first ...

<a id='what'></a>

## What is Python?
- Python is a general-purpose programming language conceived in 1989 by Dutch programmer Guido van Rossum.
- Python is free and open source, with development coordinated through the Python Software Foundation.
- Python has experienced rapid adoption in the last decade and is now one of the most popular programming languages.

<a id='some'></a>

## Some Numbers
It is difficult to pick the most meaningful comparison between programming languages. Many sources on the Internet rank the most popular languages, using very different criteria and formulas. What is certain is that there is no silver bullet, no one-size-fits-all programming language.  
The ranking used here is the IEEE's, which is based on 11 metrics, [explained here](https://spectrum.ieee.org/ieee-top-programming-languages-design-methods-and-data-sources).
Let's look at the IEEE rankings for the years 2018-2021:

| Language | 2018 | 2019 | 2020 | 2021 |
| :- | :-: | :-: | :-: | :-: |
| Python | 100 | 100 | 100 | 100 |
| Java | 97.5 | 96.3 | 95.3 | 95.4 |
| C++ | 99.7 | 87.5 | 87 | 92.4 |
| C | 96.7 | 94.4 | 94.6 | 94.7 |
| C# | 89.4 | 74.5 | NTT | 82.4 |
| JavaScript | NTT | NTT | 79.5 | 88.1 |



<a id='more'></a>

## Some more numbers
`Stack Overflow`, the biggest online platform for learning and sharing IT knowledge, provides a statistical tool called [Trends](https://insights.stackoverflow.com/trends?tags=r%2Cstatistics) that lets anybody search for trending keywords in its database.<br>
For the same six programming languages, the result is:<br><br>
<span style=''><img style='background: rgb(128, 128, 128, .15); align: left; display: inline-block; padding: 20px' src='img/python-java.png'/></span>  

### Maybe more from the Data Science community?
However, comparing Python, a generic, multi-purpose programming language, with more finely tuned and specialized languages used in data science, such as MATLAB, R or Julia, may not be an apples-to-apples comparison.  
You can learn more about the current situation from other, more specialized sources. The following are just a few of them:  
  <br>
- __Kaggle__  
Inside Kaggle you’ll find all the code & data you need to do your data science work. Use over 50,000 public datasets and 400,000 public notebooks to conquer any analysis in no time.  
https://www.kaggle.com/  
  <br>  
  
- __Datacamp__  
Data drives everything. Get the skills you need for the future of work.  
https://www.datacamp.com/
  <br>  
  
- __QuantEcon__  
Open source code for economic modeling.  
https://quantecon.org/

<a id='features'></a>

## Features
- Python is a `high-level language` suitable `for rapid development`.
- It has a `relatively small core` language supported by `many libraries` (called packages, each containing one or more modules).
- `Multiple programming styles` are supported (procedural, object-oriented, functional, etc.)  
- It is `interpreted` rather than compiled.
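
To make the "multiple programming styles" point concrete, here is a minimal sketch that computes the same sum of squares in three styles. The names `values`, `SquareSummer` and the `total_*` variables are made up for this illustration.

```python
from functools import reduce

values = [1, 2, 3, 4]

# Procedural style: an explicit loop that updates a variable.
total_procedural = 0
for v in values:
    total_procedural += v * v


# Object-oriented style: state and behaviour bundled in a class.
class SquareSummer:
    def __init__(self):
        self.total = 0

    def add(self, value):
        self.total += value * value


summer = SquareSummer()
for v in values:
    summer.add(v)
total_oo = summer.total

# Functional style: combine map and reduce, with no mutable state.
total_functional = reduce(lambda acc, sq: acc + sq,
                          map(lambda v: v * v, values), 0)

print(total_procedural, total_oo, total_functional)  # 30 30 30
```

Which style reads best depends on the problem; Python lets you mix them freely in the same program.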

<a id='syntax'></a>

## Syntax and Design
- Python has an ``elegant syntax`` that is easy to read and easy to remember.
- Closely related to elegant syntax is an `elegant design`: features like iterators, generators, decorators and list comprehensions make Python `highly expressive`, allowing you to get `more done with less code`.
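
The features named above can be sketched in a few lines. This is only an illustration using the standard library; the helper names `countdown`, `count_calls` and `double` are hypothetical.

```python
import functools

# List comprehension: build a list in a single expression.
squares = [n * n for n in range(5)]


# Generator: produce values lazily, one at a time.
def countdown(start):
    while start > 0:
        yield start
        start -= 1


# Iterator protocol: next() pulls the next value from any iterator.
it = iter(squares)
first = next(it)


# Decorator: wrap a function to add behaviour without changing its body.
def count_calls(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper


@count_calls
def double(x):
    return 2 * x


doubled = [double(n) for n in countdown(3)]

print(squares)       # [0, 1, 4, 9, 16]
print(first)         # 0
print(doubled)       # [6, 4, 2]
print(double.calls)  # 3
```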

<a id='common'></a>

## Common Uses
- Python is very beginner-friendly and is often used to `teach computer science and programming`.
- Python is a `general-purpose language` used in almost all application domains, such as communications, web development, CGI and graphical user interfaces, game development, multimedia, data processing, security, etc.
- `Used` extensively `by` Internet services and `high tech companies` including Google, Dropbox, Reddit, YouTube and Walt Disney Animation.
- Python is particularly `popular within the scientific community` with users including NASA, CERN and practically all branches of academia.
- It is also `replacing` familiar tools like `Excel` in the fields of `finance and banking`.


### Scientific Programming
Python has become one of the core languages of scientific computing.
It’s either the dominant player or a major player in:
- machine learning and data science: http://scikit-learn.org/stable/
- astronomy: http://www.astropy.org/
- artificial intelligence: https://wiki.python.org/moin/PythonForArtificialIntelligence
- chemistry: http://chemlab.github.io/chemlab/
- computational biology: http://biopython.org/wiki/Main_Page
- meteorology: https://pypi.org/project/meteorology/
- economics: https://quantecon.org/

### Numerical Programming
Fundamental matrix and array processing capabilities are provided by the excellent `NumPy` library. NumPy provides the basic array data type plus some simple processing operations.
The `SciPy` library is built on top of NumPy and provides additional functionality. SciPy includes many of the standard routines used in linear algebra, integration, interpolation, optimization, distributions and random number generation, and signal processing.
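
As a small taste of array processing, the sketch below uses only NumPy (SciPy's routines are not shown) and assumes NumPy is installed. The matrix `A` and vector `b` are arbitrary values chosen for the illustration.

```python
import numpy as np

# Coefficient matrix and right-hand side of the linear system A @ x = b.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

# Element-wise arithmetic works on whole arrays, without explicit loops.
scaled = 2 * b + 1

# Solve the linear system for x.
x = np.linalg.solve(A, b)

print(scaled)
print(x)

# Check the solution: A @ x should reproduce b up to rounding error.
print(np.allclose(A @ x, b))  # True
```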

### Graphics
The most popular and comprehensive Python library for creating figures and graphs is `Matplotlib`, with functionality including:
- plots, histograms, contour images, 3D graphs, bar charts etc.  
- output in many formats (PDF, PNG, EPS, etc.)  
- LaTeX integration

Other graphics libraries include: `Seaborn`, `Plotly`, `Bokeh`, `VPython`.
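
A minimal Matplotlib sketch, assuming the library is installed, might look like the following. The data are made up, and the `Agg` backend line is only there so the example also runs without a display; in a notebook you would normally drop it.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Illustrative, made-up values; not real statistics.
categories = ["A", "B", "C"]
counts = [3, 7, 5]

fig, ax = plt.subplots()
ax.bar(categories, counts)
ax.set_xlabel("Category")
ax.set_ylabel("Count")
ax.set_title("A simple bar chart")

# Write the figure as PNG into an in-memory buffer. Passing a file name
# such as "chart.pdf" to savefig would write a PDF to disk instead.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)

png_bytes = buffer.getvalue()
print(len(png_bytes) > 0)  # True
```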

### Statistics
Python’s data manipulation and statistics libraries have improved rapidly over the last few years.
One of the most popular libraries for working with data is `pandas`. It is fast, efficient, flexible and well designed.
Other useful statistics libraries:
- `statsmodels` — various statistical routines  
- `scikit-learn` — machine learning in Python (sponsored by Google, among others)  
- `pyMC` — for Bayesian data analysis  
- `pystan` — Bayesian analysis based on [stan](http://mc-stan.org/) 
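
To give a flavour of working with tabular data, here is a short sketch with pandas, assuming it is installed. The column names and values are invented for the example and are not real statistics.

```python
import pandas as pd

# Illustrative, made-up values; not real statistics.
df = pd.DataFrame({
    "country": ["AT", "AT", "BE", "BE"],
    "year": [2020, 2021, 2020, 2021],
    "value": [10.0, 12.0, 20.0, 26.0],
})

# Mean value per country.
mean_by_country = df.groupby("country")["value"].mean()

# Reshape to one row per country and one column per year.
wide = df.pivot(index="country", columns="year", values="value")

print(mean_by_country)
print(wide)
```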

### Networks and Graphs
Python has many libraries for studying graphs.
One well-known example is `NetworkX`.
Its features include, among many other things:
- standard graph algorithms for analyzing networks  
- plotting routines
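
The sketch below shows two standard graph operations with NetworkX, assuming the library is installed. The graph and its node names are arbitrary.

```python
import networkx as nx

# A small undirected graph; node names are arbitrary.
G = nx.Graph()
G.add_edges_from([("a", "b"), ("b", "c"), ("c", "d"), ("a", "d"), ("d", "e")])

# Shortest path between two nodes (fewest edges).
path = nx.shortest_path(G, source="a", target="e")

# Degree of each node: the number of edges attached to it.
degrees = dict(G.degree())

print(path)     # ['a', 'd', 'e']
print(degrees)
```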


<!--NAVIGATION-->
<span style='background: rgb(128, 128, 128, .15); width: 100%; display: block; padding: 10px 0 10px 10px'>< [Welcome](00.00-Welcome.ipynb) | [Contents](00.00-Index.ipynb) | [Installing Python & Tools](01.02-Installing-Python.ipynb) > [Top](#top) ^ </span>

<span style='background: rgb(128, 128, 128, .15); width: 100%; display: block; padding: 10px 0 10px 10px'>This is the Jupyter notebook version of the __Python for Official Statistics__ produced by Eurostat; the content is available [on GitHub](https://github.com/eurostat/e-learning/tree/main/python-official-statistics).
<br>The text and code are released under the [EUPL-1.2 license](https://github.com/eurostat/e-learning/blob/main/LICENSE).</span>