# SoftPython

**Introductory guide to coding, data cleaning and analysis for Python 3, with many worked exercises.**

<div class="alert alert-warning">
    
**WARNING: THIS ENGLISH VERSION IS IN-PROGRESS**
    
Completion is due by end of 2021
    
The complete Italian version is here: [it.softpython.org](https://it.softpython.org)
</div>

Nowadays, more and more decisions are based on factual and objective data. All disciplines, from engineering to the social sciences, need to process data and extract actionable information by analysing heterogeneous sources. This book of practical exercises gives an introduction to coding and data processing using [Python](https://www.python.org), a programming language popular both in industry and in research environments.

## News

**September 22, 2021**:

- major update, added new exercises and pages
- added [worked projects](#D---Worked-projects) section

**October 3, 2020**: updated [References](references.ipynb) page

Old news: [link](changelog.ipynb)

## Intended audience

This book can be useful both for novices who have never really programmed before and for students with a more technical background who want to learn about data extraction, cleaning, analysis and visualization (the frameworks used include Pandas, NumPy and the Jupyter editor). We will try to process data in a practical way, without delving into more advanced considerations about algorithmic complexity and data structures. To overcome issues and guarantee concrete learning results, we will present step-by-step tutorials.

## Contents

* [Overview](overview.ipynb): Approach and goals
    
### A - Foundations

<h3 id="foundations"></h3>

1.  [Installation](installation.ipynb)
1.  [Tools and scripts](tools/tools-sol.ipynb)

### A.1 Data types

<h3 id="data-types"></h3> <h3 id="basics"></h3> <h3 id="strings"></h3><h3 id="tuples"></h3> <h3 id="sets"></h3><h3 id="dictionaries"></h3>

1.  Basics: &nbsp;&nbsp;[1. variables and integers](basics/basics1-ints-sol.ipynb) &nbsp;&nbsp;[2. booleans](basics/basics2-bools-sol.ipynb) &nbsp;&nbsp;[3. real numbers](basics/basics3-floats-sol.ipynb) &nbsp;&nbsp;[4. challenges](basics/basics4-chal.ipynb)

1.  Strings: &nbsp;&nbsp;[1. intro](strings/strings1-sol.ipynb) &nbsp;&nbsp;[2. operators](strings/strings2-sol.ipynb) &nbsp;&nbsp;[3. basic methods](strings/strings3-sol.ipynb) &nbsp;&nbsp;[4. search methods](strings/strings4-sol.ipynb)&nbsp;&nbsp; <!--[5. challenges](strings/strings5-chal.ipynb)-->
    
1.  Lists: &nbsp;&nbsp;[1. intro](lists/lists1-sol.ipynb) &nbsp;&nbsp;[2. operators](lists/lists2-sol.ipynb) &nbsp;&nbsp;[3. basic methods](lists/lists3-sol.ipynb) &nbsp;&nbsp;[4. search methods](lists/lists4-sol.ipynb) &nbsp;&nbsp;<!--[5. challenges](lists/lists5-chal.ipynb)-->
    
1.  Tuples: &nbsp;&nbsp;[1. intro](tuples/tuples1-sol.ipynb) &nbsp;&nbsp;<!--[2. challenges](tuples/tuples2-chal.ipynb)-->

1.  Sets: &nbsp;&nbsp;[1. intro](sets/sets1-sol.ipynb) &nbsp;&nbsp;<!--[2. challenges](sets/sets2-chal.ipynb)-->

1.  Dictionaries: &nbsp;&nbsp;[1. intro](dictionaries/dictionaries1-sol.ipynb) &nbsp;&nbsp;[2. operators](dictionaries/dictionaries2-sol.ipynb) &nbsp;&nbsp;[3. methods](dictionaries/dictionaries3-sol.ipynb) &nbsp;&nbsp;[4. special classes](dictionaries/dictionaries4-sol.ipynb) &nbsp;&nbsp;<!--[5. challenges](dictionaries/dictionaries5-chal.ipynb)-->

### A.2 Control flow

<h3 id="control-flow"></h3> <h3 id="if"></h3> <h3 id="for"></h3> <h3 id="while"></h3><h3 id="sequences"></h3>

1.  If conditionals: &nbsp;&nbsp;[1. intro](if/if1-sol.ipynb) &nbsp;&nbsp;<!--[2. challenges](if/if2-chal.ipynb)  -->
    
1.  For loops: &nbsp;&nbsp;[1. intro](for/for1-intro-sol.ipynb) &nbsp;&nbsp;[2. strings](for/for2-strings-sol.ipynb) &nbsp;&nbsp;[3. lists](for/for3-lists-sol.ipynb) &nbsp;&nbsp;[4. tuples](for/for4-tuples-sol.ipynb) &nbsp;&nbsp;[5. sets](for/for5-sets-sol.ipynb) &nbsp;&nbsp;[6. dictionaries](for/for6-dictionaries-sol.ipynb) 

    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[7. nested for](for/for7-nested-sol.ipynb) &nbsp;&nbsp;<!--[8. challenges](for/for8-chal.ipynb)-->

1.  While loops: &nbsp;&nbsp;[1. intro](while/while1-sol.ipynb) &nbsp;&nbsp;<!--[2. challenges](while/while2-chal.ipynb)-->
1.  Sequences and comprehensions: &nbsp;&nbsp;[1. intro](sequences/sequences1-sol.ipynb) &nbsp;&nbsp;<!--[2. challenges](sequences/sequences2-chal.ipynb)-->

### A.3 Algorithms

<h3 id="algorithms"></h3><h3 id="functions"></h3><h3 id="matrices-lists"></h3><h3 id="mixed-structures"></h3><h3 id="matrices-numpy"></h3>

1.  Functions: &nbsp;&nbsp;[1. intro](functions/fun1-intro-sol.ipynb) &nbsp;&nbsp;[2. error handling and testing](functions/fun2-errors-and-testing-sol.ipynb)
<!--
    &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[3. strings](functions/fun3-strings-sol.ipynb) &nbsp;&nbsp;[4. lists](functions/fun4-lists-sol.ipynb) &nbsp;&nbsp;[5. tuples](functions/fun5-tuples-sol.ipynb) [6. sets](functions/fun6-sets-sol.ipynb)&nbsp;&nbsp;[7. dictionaries](functions/fun7-dictionaries-sol.ipynb) &nbsp;&nbsp;[8. challenges](functions/fun8-chal.ipynb)-->
    
1.  Matrices - list of lists: &nbsp;&nbsp;[1. intro](matrices-lists/matrices-lists1-sol.ipynb) &nbsp;&nbsp;[2. other exercises](matrices-lists/matrices-lists2-sol.ipynb) &nbsp;&nbsp;<!--[3. challenges](matrices-lists/matrices-lists3-chal.ipynb)-->

1.  Mixed structures: &nbsp;&nbsp;[1. intro](mixed-structures/mixed-structures1-sol.ipynb) &nbsp;&nbsp;<!--[2. challenges](mixed-structures/mixed-structures2-chal.ipynb)-->

1.  Matrices - numpy: &nbsp;&nbsp;[1. intro](matrices-numpy/matrices-numpy1-sol.ipynb) &nbsp;&nbsp;[2. exercises](matrices-numpy/matrices-numpy2-sol.ipynb) &nbsp;&nbsp;<!--[3. challenges](matrices-numpy/matrices-numpy3-chal.ipynb)-->

### B - Data analysis

<h3 id="data-analysis"></h3><h3 id="formats"></h3><h3 id="visualization"></h3><h3 id="pandas"></h3><h3 id="binary-relations"></h3>

1.  Data formats: &nbsp;&nbsp;[1. line files](formats/formats1-lines-sol.ipynb) &nbsp;&nbsp;[2. CSV files](formats/formats2-csv-sol.ipynb) &nbsp;&nbsp;[3. JSON files](formats/formats3-json-sol.ipynb) &nbsp;&nbsp;[4. graph formats](formats/formats4-graph-sol.ipynb) &nbsp;&nbsp;
    
   <!-- &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[5. challenges](formats/formats5-chal.ipynb)-->
    
1. Visualization: &nbsp;&nbsp; [1. intro](visualization/visualization1-sol.ipynb) &nbsp;&nbsp; [challenges](visualization/visualization2-chal.ipynb) &nbsp;&nbsp; [images](visualization/visualization-images-sol.ipynb)
1. Analytics with Pandas: [1. intro](pandas/pandas1-sol.ipynb) &nbsp;&nbsp; [2. exercises](pandas/pandas2-sol.ipynb) &nbsp;&nbsp; [3. challenge](pandas/pandas3-chal.ipynb)
1. [Binary relations](binary-relations/binary-relations-sol.ipynb)    

### C - Applications

<h3 id="applications"></h3>

### D - Worked projects

<h3 id="worked-projects"></h3>

<!--
Projects as exercises (with solution), involving some raw data preprocessing, simple analysis and final chart display. 
-->

### E - Appendix

<h3 id="appendix"></h3>

* [Commandments](commandments.ipynb)
* [References](references.ipynb)

## Author
 
**David Leoni**: Software engineer specialized in data integration and the semantic web, who has built applications in the open data and medical fields in Italy and abroad. He frequently collaborates with the University of Trento on teaching activities in various departments. Since 2019 he has been president of the CoderDolomiti Association, where, along with Marco Caresia, he manages the CoderDojo Trento volunteer movement to teach creative coding to kids. <br/>
Email: [david.leoni@unitn.it](mailto:david.leoni@unitn.it) &ensp; Website: [davidleoni.it](https://davidleoni.it)

### Contributors

**Marco Caresia** (2017 Autumn Edition assistant @DISI, University of Trento): He has been an informatics teacher at Scuola Professionale Einaudi of Bolzano. He is president of the Trentino Alto Adige Südtirol delegation of the Associazione Italiana Formatori and vice president of the CoderDolomiti Association.

**Alessio Zamboni** (2018 March Edition assistant @Sociology Department, University of Trento): Data scientist and software engineer with experience in NLP, GIS and knowledge management. He has collaborated on numerous research projects, gaining experience in Europe and Asia. He strongly believes that _'Programming is a work of art'_.

**Massimiliano Luca** (2019 Summer Edition teacher @Sociology Department, University of Trento): Loves learning new technologies each day. Particularly interested in knowledge representation, data integration, data modeling and computational social science. Firmly believes it is vital to introduce youngsters to computer science, and has been mentoring at Coder Dojo DISI Master.

## License

The making of this website and the related courses was funded mainly by the [Department of Information Engineering and Computer Science (DISI)](https://www.disi.unitn.it), University of Trento, and also by the [Sociology](https://www.sociologia.unitn.it/en) and [Mathematics](https://www.maths.unitn.it/en) departments.

![unitn-843724](_static/img/third-parties/disi-unitn-en-logo-468-153.png)


![cc-by-7172829](_static/img/cc-by.png)

All the material on this website is distributed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license: [https://creativecommons.org/licenses/by/4.0/deed.en](https://creativecommons.org/licenses/by/4.0/deed.en)

In short, you can freely redistribute and modify the content; just remember to cite the University of Trento and [the authors](https://en.softpython.org/index.html#Author).

Technical notes: all website pages are easily modifiable Jupyter notebooks that were converted to web pages using [NBSphinx](https://nbsphinx.readthedocs.io) with the [Jupman](https://github.com/DavidLeoni/jupman) template. Text sources are on GitHub at [https://github.com/DavidLeoni/softpython-en](https://github.com/DavidLeoni/softpython-en)


## Acknowledgments

We particularly thank Professor Alberto Montresor of the Department of Information Engineering and Computer Science, University of Trento, for making possible the first courses from which this material was born, and the Trentino Open Data project ([dati.trentino.it](https://dati.trentino.it)) for the numerous datasets provided.

![dati-trentino-9327234823487](_static/img/third-parties/dati-trentino-small.png)

Many other institutions and companies that contributed material and ideas over time are cited [on this page](thanks.ipynb).