# A Python short course on Atmospheric Data Analysis - Week 2  

This Python tutorial was written in June 2024 by Ludving Cano, Research Assistant at the [Laboratory for Atmospheric Physics](http://www.chacaltaya.edu.bo) - UMSA (lcano@chacaltaya.edu.bo). It covers the basic I/O commands, opening datasets, and formatting data into Pandas dataframes.

In **week 2** we will cover:

 - Usage of Paths in Python
 - Open and read files
 - Formatting data
 - `Pandas I`
   - Opening a simple dataset
   - Basic commands with pandas
   - Opening a more complicated dataset
   - Basic statistics with Pandas

## Libraries that we will use

This week we will start using libraries. Which library is best depends on what you want to do. For now we will use:

 - [OS](https://docs.python.org/3/library/os.html) is a library for general tasks that involve the system itself (your computer). We will use it mostly for searching for and opening files.
 - [Pathlib](https://docs.python.org/3/library/pathlib.html) is a library that offers some advantages when working with files. We will learn it in parallel with OS.
 - [Pandas](https://pandas.pydata.org/) will be our main library from now on. It offers A LOT of advantages when working with tables (which we will now call _dataframes_).
 - [Numpy](https://numpy.org/) is mainly a numerical library. It offers advantages for numerical processing and calculations.
 
I added a link to the main documentation in each bullet. Just click on it!

# 1. Paths
What's a path? It is a route (in Spanish, _ruta_): it tells us where a file is located so that we can reach it. A path is a sequence of directories and subdirectories (with the file itself at the end), usually separated by a backslash (\\) on Windows or a forward slash (/) on Linux and macOS.
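To make the idea of "directories separated by a slash" concrete, here is a minimal sketch that splits a path into its pieces. The two locations are hypothetical and only serve as illustration. `PurePosixPath` and `PureWindowsPath` are Pathlib classes that only manipulate the text of a path, so this runs on any operating system.

```python
import os
from pathlib import PurePosixPath, PureWindowsPath

# Separator used by the operating system that is running this code
print(repr(os.sep))

# The same kind of path written in both styles (hypothetical locations)
posix_path = PurePosixPath('/home/user/data/file.txt')
windows_path = PureWindowsPath(r'C:\Users\user\data\file.txt')

# .parts splits a path into the root, the directories and the file name
print(posix_path.parts)
print(windows_path.parts)
```

In both cases the last element of `.parts` is the file itself, and everything before it is the chain of directories that leads to it.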

There are two main types: 

 - Absolute path: where a file is located starting from the ROOT (the origin) of the computer. The advantage is that we can reach the file no matter which directory we are working from. The disadvantage is that if we change computers, we have to change the absolute path.
 - Relative path: where a file is located starting from where YOU are right now. For example, if we want to access any of our data we simply go to `data/`. The disadvantage is that if we change the directory we execute our code from, we have to change the relative path.
 

If you want to see it this way, the absolute path is the current directory + the relative path.
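A minimal sketch of that relationship, assuming a hypothetical file `data/example.txt` (the file does not need to exist, because we are only building the path). The names `relative_path` and `absolute_path` are chosen for illustration.

```python
from pathlib import Path

# Hypothetical relative path, used only for illustration
relative_path = Path('data') / 'example.txt'

# Current working directory + relative path = absolute path
absolute_path = Path.cwd() / relative_path

print(relative_path.is_absolute())  # False
print(absolute_path.is_absolute())  # True
print(absolute_path)
```

The last line will print something different on each computer, which is exactly why absolute paths have to be changed when you move your code to another machine.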

For example, in JupyterLab you can get the absolute path by right-clicking on a file, and in VSCode you have the option for both.

<b><font color="green" size=5>Example 1: Where are you from?</font></b>


As said before, the directory from which you execute your code (a Python script, for example) is called the _current working directory_. I'll show how to display it in two ways, with Pathlib and OS.

In [1]:
## Let's not forget to import our libraries
import os
from pathlib import Path #--We will only use Path, not the entire library

In [2]:
os.getcwd()

'/home/ludving/LFA/LFA-python-short-course'

In [3]:
Path.cwd()

PosixPath('/home/ludving/LFA/LFA-python-short-course')

As you can see, they show practically the same thing: where you are running your code from. The only difference is that `Path.cwd()` returns an object called `PosixPath` (on Windows it would be a `WindowsPath`).

Anyway, for the first method, `os.getcwd()`, we can see that the result is a string:

In [6]:
# Write your code for showing which type of variable is os.getcwd()



### Writing simple paths
As you can see, we can write a path to whatever we want by just defining a string with its content. For example:

In [7]:
path_test = 'data_samples/test1.txt'
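Typing the separator by hand works, but a path can also be assembled from its pieces, which lets Python choose the right separator for your system. A minimal sketch using the same `data_samples/test1.txt` location; `path_os` and `path_pl` are names chosen for illustration.

```python
import os
from pathlib import Path

# Build the same relative path piece by piece
path_os = os.path.join('data_samples', 'test1.txt')  # a plain string
path_pl = Path('data_samples') / 'test1.txt'         # a Path object

print(path_os)
print(path_pl)

# A Path object also gives direct access to the parts of the path
print(path_pl.name)    # test1.txt
print(path_pl.suffix)  # .txt
print(path_pl.parent)  # data_samples
```

Both versions describe the same location. The difference is that `path_os` is a string, while `path_pl` is a `Path` object with extra attributes such as `.name` and `.suffix`.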

### Knowing if something exists `os.path.exists()`
Sometimes we want to know whether a certain file exists (this is useful when you work with consistently named files, for example when the file name is the date). If we try to open something that doesn't exist, the code will raise an error, so it's good to check before opening it.

In [9]:
does_exist = os.path.exists(path_test)
does_exist

False
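The same check can be done with Pathlib, and the result can be used to decide whether to open the file. This is a minimal sketch, not the only way to do it. `exists_os`, `exists_pl` and `first_line` are names chosen for illustration, and `Path.is_file()` is used for the final check so that a directory with the same name is not mistaken for a file.

```python
import os
from pathlib import Path

path_test = 'data_samples/test1.txt'

# Two equivalent ways of asking whether the path exists
exists_os = os.path.exists(path_test)
exists_pl = Path(path_test).exists()
print(exists_os, exists_pl)

# Only try to open the file if it is really there
if Path(path_test).is_file():
    with open(path_test) as f:
        first_line = f.readline()
    print(first_line)
else:
    print(f'{path_test} was not found, skipping')
```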

For now we will stop here, and if needed we will learn more about `os` and `Path` later. Let's open our first file!