<img src="../images/logo_VORTEX.png" width="200" height="auto" alt="Company Logo">

| Project| Authors           | Company                                 | Year | Chapter |
|--------|-------------------|-----------------------------------------|------|---------|
| Pywind | Oriol L & Arnau T | [Vortex FdC](https://www.vortexfdc.com) | 2024 | 2       |

# Chapter 2: Loading Txt

_Overview:_

This notebook reads meteorological data from **text files (.txt)**, using helper functions to load each file, show its structure, and give a quick overview of its contents.

- Measurements (txt): a single height with a limited set of variables.
- Vortex (txt): a single height with multiple variables.

_Data Storage:_

The acquired data is stored in two data structures for comparison and analysis:
- Xarray Dataset
- Pandas DataFrame
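
As a rough illustration of how the two structures differ, the sketch below stores the same small series both ways. The values are synthetic and only meant to show the shapes involved; `xr.Dataset.from_dataframe` is one standard way to move from Pandas to Xarray and is not necessarily what the helper functions in this notebook use.

```python
import pandas as pd
import xarray as xr

# Tiny synthetic series (illustrative values only)
time = pd.date_range("2004-01-01", periods=3, freq="h", name="time")
df = pd.DataFrame({"M": [4.6, 4.3, 4.9], "Dir": [154.0, 162.0, 166.0]}, index=time)

# Pandas: a 2-D table, rows are timestamps and columns are variables
print(df.shape)            # (3, 2)

# Xarray: each column becomes a data variable along the named 'time' dimension
ds = xr.Dataset.from_dataframe(df)
print(list(ds.data_vars))  # ['M', 'Dir']
print(dict(ds.sizes))      # {'time': 3}
```

The DataFrame is organised around rows and columns, while the Dataset is organised around named dimensions and variables, which becomes more useful once several heights or locations are involved.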

_Objective:_

- Understand the differences in how Xarray and Pandas store data.
- Use basic commands such as `describe()` and `head()` to get a quick overview of the loaded data.
- Define functions in external files.

### Import Libraries

In [25]:
import sys
import os

sys.path.append(os.path.join(os.getcwd(), '../examples'))

from typing import Dict
from example_2_read_txt_functions import *
# from example_2_read_txt_functions import _get_coordinates_vortex_header
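
The pattern of keeping functions in an external file and importing them can be sketched in isolation as follows. The module name `my_txt_helpers` and the function `count_lines` are hypothetical stand-ins created only for this example; they are not part of the Pywind examples folder.

```python
import sys
import tempfile
from pathlib import Path

# Write a tiny stand-in module to a temporary folder (hypothetical name and content)
module_dir = Path(tempfile.mkdtemp())
(module_dir / "my_txt_helpers.py").write_text(
    "def count_lines(text):\n"
    "    return len(text.splitlines())\n"
)

# Make the folder importable, in the same way '../examples' is added to sys.path
sys.path.append(str(module_dir))

# Import only the name that is needed, instead of 'import *'
from my_txt_helpers import count_lines

print(count_lines("a\nb\nc"))  # 3
```

Importing names explicitly makes it easier to see where each function comes from, at the cost of a slightly longer import line.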


### Define Paths and Site

In [26]:
SITE = 'froya'
pwd = os.getcwd()
base_path = str(os.path.join(pwd, '../data'))

print()
measurements_netcdf = os.path.join(base_path, f'{SITE}/measurements/obs.nc')
vortex_netcdf = os.path.join(base_path, f'{SITE}/vortex/SERIE/vortex.serie.era5.utc0.nc')

vortex_txt = os.path.join(base_path, f'{SITE}/vortex/SERIE/vortex.serie.era5.utc0.100m.txt')
measurements_txt = os.path.join(base_path, f'{SITE}/measurements/obs.txt')

# Print filenames
print('Measurements txt: ', measurements_txt)
print('Vortex txt: ', vortex_txt)


Measurements txt:  /home/oriol/vortex/git/pywind_private/notebooks/../data/froya/measurements/obs.txt
Vortex txt:  /home/oriol/vortex/git/pywind_private/notebooks/../data/froya/vortex/SERIE/vortex.serie.era5.utc0.100m.txt


### Read Vortex Text Series Functions

In [27]:
ds_vortex = read_vortex_serie(vortex_txt)
ds_vortex
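
For readers curious about what a reader function of this kind might do internally, here is a minimal sketch. It assumes a hypothetical file layout (one metadata line, one header line, then whitespace-separated rows); the real Vortex format and the actual `read_vortex_serie` implementation may differ. The function name `read_txt_series` is invented for this example.

```python
import io

import pandas as pd
import xarray as xr

# Hypothetical file content; the real Vortex layout may differ
sample_txt = """\
Lat=63.66639 Lon=8.342499 Height=100
YYYYMMDD HHMM M Dir
20040101 0000 4.6 154
20040101 0100 4.3 162
20040101 0200 4.9 166
"""


def read_txt_series(buffer) -> xr.Dataset:
    """Parse a whitespace-separated time series into an Xarray Dataset."""
    # Skip the metadata line and keep date/time as strings so '0000' is preserved
    df = pd.read_csv(
        buffer, sep=r"\s+", skiprows=1, dtype={"YYYYMMDD": str, "HHMM": str}
    )
    # Build a datetime index from the separate date and time columns
    df["time"] = pd.to_datetime(df["YYYYMMDD"] + df["HHMM"], format="%Y%m%d%H%M")
    df = df.drop(columns=["YYYYMMDD", "HHMM"]).set_index("time")
    return xr.Dataset.from_dataframe(df)


ds = read_txt_series(io.StringIO(sample_txt))
print(ds["M"].values)  # [4.6 4.3 4.9]
```
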



Now we convert *ds_vortex* to a Pandas DataFrame and use the Pandas `head()` function to display the first 5 rows.

In [28]:
df_vortex = ds_vortex.to_dataframe() 
df_vortex.head()

| time                | lat      | lon      | lev   | utc | M   | Dir | T    | D    | P      | RI    | RH   | RMOL    |
|---------------------|----------|----------|-------|-----|-----|-----|------|------|--------|-------|------|---------|
| 2004-01-01 00:00:00 | 63.66639 | 8.342499 | 100.0 | 0.0 | 4.6 | 154 | -1.8 | 1.29 | 1005.5 | -0.75 | 67.0 | -0.1985 |
| 2004-01-01 01:00:00 | 63.66639 | 8.342499 | 100.0 | 0.0 | 4.3 | 162 | -1.7 | 1.29 | 1005.2 | -1.48 | 64.9 | -0.2098 |
| 2004-01-01 02:00:00 | 63.66639 | 8.342499 | 100.0 | 0.0 | 4.9 | 166 | -1.7 | 1.29 | 1005.2 | -3.55 | 64.6 | -0.1699 |
| 2004-01-01 03:00:00 | 63.66639 | 8.342499 | 100.0 | 0.0 | 4.7 | 156 | -1.8 | 1.29 | 1004.7 | -2.01 | 63.6 | -0.174  |
| 2004-01-01 04:00:00 | 63.66639 | 8.342499 | 100.0 | 0.0 | 4.7 | 145 | -1.9 | 1.29 | 1004.4 | -1.03 | 62.2 | -0.1747 |


To display only the `M` and `Dir` columns:

In [29]:
df_vortex[['M', 'Dir']].head()

| time                | M   | Dir |
|---------------------|-----|-----|
| 2004-01-01 00:00:00 | 4.6 | 154 |
| 2004-01-01 01:00:00 | 4.3 | 162 |
| 2004-01-01 02:00:00 | 4.9 | 166 |
| 2004-01-01 03:00:00 | 4.7 | 156 |
| 2004-01-01 04:00:00 | 4.7 | 145 |
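
The same list-of-names selection works on both structures. The sketch below uses a small synthetic series rather than the Vortex file, so the variable names `df` and `ds` here are stand-ins for `df_vortex` and `ds_vortex`.

```python
import pandas as pd
import xarray as xr

# Small synthetic series standing in for the loaded Vortex data
time = pd.date_range("2004-01-01", periods=3, freq="h", name="time")
df = pd.DataFrame(
    {"M": [4.6, 4.3, 4.9], "Dir": [154, 162, 166], "T": [-1.8, -1.7, -1.7]},
    index=time,
)
ds = xr.Dataset.from_dataframe(df)

# Pandas: a list of column names returns a smaller DataFrame
df_subset = df[["M", "Dir"]]

# Xarray: a list of variable names returns a smaller Dataset
ds_subset = ds[["M", "Dir"]]

print(list(df_subset.columns))    # ['M', 'Dir']
print(list(ds_subset.data_vars))  # ['M', 'Dir']
```
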


### Read Measurements Txt

In [30]:
df_obs = read_vortex_obs_to_dataframe(measurements_txt)
ds_obs = convert_to_xarray(df_obs)
df_obs.head()



| time                | M    | Dir   |
|---------------------|------|-------|
| 2009-11-18 13:50:00 | 3.62 | 159.0 |
| 2009-11-18 14:00:00 | 3.46 | 153.0 |
| 2009-11-18 14:10:00 | 2.99 | 153.0 |
| 2009-11-18 14:20:00 | 2.41 | 151.5 |
| 2009-11-18 14:30:00 | 1.91 | 150.5 |
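
A DataFrame with a named time index can be moved to Xarray and back with the built-in Pandas and Xarray methods. The sketch below shows that round trip on three rows copied from the table above; it is an assumption that `convert_to_xarray` does something similar, since its implementation lives in the external functions file.

```python
import pandas as pd

# Three rows copied from the measurements preview
time = pd.to_datetime(["2009-11-18 13:50", "2009-11-18 14:00", "2009-11-18 14:10"])
df = pd.DataFrame(
    {"M": [3.62, 3.46, 2.99], "Dir": [159.0, 153.0, 153.0]},
    index=pd.Index(time, name="time"),
)

# DataFrame -> Dataset: the index name becomes the dimension name
# (DataFrame.to_xarray requires the xarray package to be installed)
ds = df.to_xarray()

# Dataset -> DataFrame: the dimension becomes the index again
df_back = ds.to_dataframe()

print(list(ds.dims))
print(df_back.head())
```
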


### Now we can compare statistics

In [31]:
from IPython.display import display

display(df_vortex[['M', 'Dir']].describe().round(2))
display(df_obs.describe().round(2))

|       | M        | Dir      |
|-------|----------|----------|
| count | 176424.0 | 176424.0 |
| mean  | 8.43     | 183.1    |
| std   | 4.77     | 89.74    |
| min   | 0.1      | 0.0      |
| 25%   | 4.9      | 111.0    |
| 50%   | 7.6      | 194.0    |
| 75%   | 11.0     | 248.0    |
| max   | 34.7     | 360.0    |


|       | M        | Dir      |
|-------|----------|----------|
| count | 171895.0 | 171895.0 |
| mean  | 8.06     | 178.48   |
| std   | 4.78     | 95.98    |
| min   | 0.09     | 0.0      |
| 25%   | 4.57     | 93.0     |
| 50%   | 7.04     | 197.0    |
| 75%   | 10.6     | 251.0    |
| max   | 36.1     | 360.0    |
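
The two summaries above cover different periods and time steps (hourly model data and 10-minute measurements), so they are not strictly like-for-like. One possible way to compare concurrent values is sketched below on synthetic data: average the measurements to hourly values, then keep only timestamps present in both series. The values and the variable names `model` and `obs` are made up for illustration, and whether a plain hourly mean matches how the Vortex series is defined is an assumption. The sketch is limited to wind speed because wind direction needs a circular mean.

```python
import pandas as pd

# Synthetic stand-ins: hourly model data and 10-minute observations
model = pd.DataFrame(
    {"M": [5.0, 6.0]},
    index=pd.date_range("2009-11-18 14:00", periods=2, freq="h", name="time"),
)
obs = pd.DataFrame(
    {"M": [4.0, 5.0, 6.0, 5.0, 4.0, 6.0, 7.0]},
    index=pd.date_range("2009-11-18 14:00", periods=7, freq="10min", name="time"),
)

# Average the 10-minute observations to hourly values
obs_hourly = obs.resample("h").mean()

# Keep only timestamps present in both series
both = model.join(obs_hourly, how="inner", lsuffix="_model", rsuffix="_obs")

# Model minus observation at each concurrent timestamp
both["bias"] = both["M_model"] - both["M_obs"]
print(both)
```
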


### Thank you for completing this Notebook! 
### *Other references available upon request.*

You can now:

- Read Vortex SERIES txt files.
- Convert from txt to NetCDF.
- Convert to **Pandas** DataFrames.
- Have a quick overview of the data using `head()` and `describe()` Pandas functions.

**Don't hesitate to [contact us](https://vortexfdc.com/contact/) for any questions and information.**

## Change Log


| Date (YYYY-MM-DD) | Version | Changed By | Change Description                         |
|-------------------|---------|------------|--------------------------------------------|
| 2024-06-25        | 0.0     | Arnau      | Notebook creation                          |

<hr>

<h3 align="center"> © Vortex F.d.C. 2024. All rights reserved. </h3>