# Lab 2.1 - Weather Data Around Winona

In this lab, we will download and combine a decade's worth of weather data from NOAA, focusing on weather stations within 25 miles of Winona.

Here is the outline of the basic process.

1. Install and investigate useful packages.
2. Find all weather stations in proximity to Winona.
3. Use a single station to prototype our tools.
4. Automate the process of downloading and uncompressing data from all stations of interest.
5. Output the results to a CSV file.

## Problem 1 - Install and investigate useful tools.

First, you should install and investigate the following tools.

1. **`wget`** is a tool for programmatically downloading data files from the web on the command line. There is a Python wrapper to this tool that you can install with `pip` as shown below.
2. **`geopy`** is a package that, among other things, implements a function for computing distances between two lat-long pairs. Again, install this package with `pip` as shown below.
3. **`gzip`** is part of the standard Python library, so it needs no installation; we will use it to uncompress `.gz` files.

In [1]:
%pip install wget

Note: you may need to restart the kernel to use updated packages.


In [2]:
%pip install geopy

Note: you may need to restart the kernel to use updated packages.


#### Task 1.1 - Investigate using `wget` to download a file.

Read the help/documentation on `wget` to figure out how to download the following data file (a sample data file from STAT 210) into the `./data` sub-folder.

[https://github.com/yardsale8/STAT_210/raw/refs/heads/main/data/sars1.csv](https://github.com/yardsale8/STAT_210/raw/refs/heads/main/data/sars1.csv)

In [4]:
#don't run again! file already in data folder!
import wget

url = 'https://github.com/yardsale8/STAT_210/raw/refs/heads/main/data/sars1.csv'
data = wget.download(url, out = './data')

HTTPError: HTTP Error 504: Gateway Time-out
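The "don't run again" comment can be turned into code. The sketch below is one possible guard, not part of the lab's required solution: `download_once` is a hypothetical helper that skips the download when a file with the same name is already in the folder. The `downloader` argument is assumed to be any callable with the same `(url, out=...)` calling convention as `wget.download`.

```python
import os


def download_once(url, folder, downloader):
    """Download url into folder unless a file with that name is already there.

    `downloader` is any callable taking (url, out=folder), e.g. wget.download.
    """
    os.makedirs(folder, exist_ok=True)
    target = os.path.join(folder, url.split('/')[-1])
    if os.path.exists(target):
        return target          # already downloaded, so skip the network call
    downloader(url, out=folder)
    return target
```

With this helper, a cell such as `download_once(url, './data', wget.download)` should be safe to re-run, since the second run finds the file and returns immediately.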

#### Task 1.2 - Investigate using `geopy.distance.distance` to compute a distance in miles.

1. Import the `distance` function from the `geopy.distance` submodule.
2. Use Wikipedia to find the lat-long coordinates of Winona and Rochester MN.
3. Use `distance` to compute the distance between Winona and Rochester.
4. Use some other source (e.g., Google Maps) to check the answer.

In [14]:
from geopy.distance import distance

winona_lat_long = (44.0499, -91.6393)
rochester_lat_long = (44.0216, -92.4699)

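#store the result under a new name so the imported `distance` function is not overwritten
(winona_to_rochester := distance(winona_lat_long, rochester_lat_long).miles)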

41.416366055391336

Google Maps reports 45.5 miles from Winona to Rochester, but that is the driving distance along roads. The value computed above is the straight-line distance between the two points, so it is expected to be somewhat shorter.
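As a second, independent check, the distance can be estimated with the haversine formula using only the standard library. This is a rough sketch: it assumes a spherical Earth with a mean radius of 3958.8 miles, whereas `geopy` works on an ellipsoid, so the two results should be close but not identical. The name `haversine_miles` is made up for this example.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # assumed mean radius of a spherical Earth


def haversine_miles(p1, p2):
    """Great-circle distance in miles between two (lat, long) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (*p1, *p2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


winona_lat_long = (44.0499, -91.6393)
rochester_lat_long = (44.0216, -92.4699)
print(haversine_miles(winona_lat_long, rochester_lat_long))
```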

#### Task 1.3 - Investigate `gzip`

The yearly NOAA data is compressed as `.gz` files, which need to be uncompressed using `gzip`. Explore the `gzip` module as follows.

1. Explore the documentation/help for the `gzip` module.
2. Use `wget` to download the file linked below into the `./data` folder.
3. Use `gzip` to uncompress this file and read its lines into a list.
4. Inspect the data in your list; each line should be of type `bytes`. Use a comprehension with the expression `l.decode('utf-8')` to convert this to a list of strings.
5. Write the uncompressed lines to an output file using `with open(path, 'w') as out` and the `writelines` method of `out`.

**Link.** [https://www.ncei.noaa.gov/pub/data/ghcn/daily/by_year/1750.csv.gz](https://www.ncei.noaa.gov/pub/data/ghcn/daily/by_year/1750.csv.gz)

In [4]:
import gzip

In [None]:
#don't run again! file already in data folder!
url_noaa = 'https://www.ncei.noaa.gov/pub/data/ghcn/daily/by_year/1750.csv.gz'

data_noaa = wget.download(url_noaa, out = './data')

In [7]:
with gzip.open('./data/1750.csv.gz', 'rb') as compressed_file:
    noaa_uncompressed = compressed_file.readlines()
noaa_uncompressed

[b'ASN00002061,17500201,PRCP,56,,,a,\n',
 b'ASN00003014,17500201,PRCP,0,,,a,\n',
 b'ASN00003059,17500201,PRCP,0,,,a,\n',
 b'ASN00003088,17500201,PRCP,0,,,a,\n',
 b'ASN00009015,17500201,PRCP,0,,,a,\n',
 b'ASN00009193,17500201,TMIN,187,,,a,\n',
 b'ASN00009193,17500201,PRCP,0,,,a,\n',
 b'ASN00009500,17500201,DATX,2,,,a,\n',
 b'ASN00009500,17500201,MDTX,210,,,a,\n',
 b'ASN00009592,17500201,DATX,4,,,a,\n',
 b'ASN00009592,17500201,MDTX,278,,,a,\n',
 b'ASN00009752,17500201,DAPR,4,,,a,\n',
 b'ASN00009752,17500201,DWPR,4,,,a,\n',
 b'ASN00009752,17500201,MDPR,8,,,a,\n',
 b'ASN00009784,17500201,DAPR,3,,,a,\n',
 b'ASN00009784,17500201,DWPR,3,,,a,\n',
 b'ASN00009784,17500201,MDPR,30,,,a,\n',
 b'ASN00010006,17500201,PRCP,0,,,a,\n',
 b'ASN00010152,17500201,PRCP,0,,,a,\n',
 b'ASN00010536,17500201,DATX,4,,,a,\n',
 b'ASN00010536,17500201,MDTX,325,,,a,\n',
 b'ASN00010628,17500201,PRCP,0,,,a,\n',
 b'ASN00010655,17500201,PRCP,0,,,a,\n',
 b'ASN00010729,17500201,DAPR,3,,,a,\n',
 b'ASN00010729,17500201,DWPR,3

In [9]:
(noaa_strings := [l.decode('utf-8') for l in noaa_uncompressed])

['ASN00002061,17500201,PRCP,56,,,a,\n',
 'ASN00003014,17500201,PRCP,0,,,a,\n',
 'ASN00003059,17500201,PRCP,0,,,a,\n',
 'ASN00003088,17500201,PRCP,0,,,a,\n',
 'ASN00009015,17500201,PRCP,0,,,a,\n',
 'ASN00009193,17500201,TMIN,187,,,a,\n',
 'ASN00009193,17500201,PRCP,0,,,a,\n',
 'ASN00009500,17500201,DATX,2,,,a,\n',
 'ASN00009500,17500201,MDTX,210,,,a,\n',
 'ASN00009592,17500201,DATX,4,,,a,\n',
 'ASN00009592,17500201,MDTX,278,,,a,\n',
 'ASN00009752,17500201,DAPR,4,,,a,\n',
 'ASN00009752,17500201,DWPR,4,,,a,\n',
 'ASN00009752,17500201,MDPR,8,,,a,\n',
 'ASN00009784,17500201,DAPR,3,,,a,\n',
 'ASN00009784,17500201,DWPR,3,,,a,\n',
 'ASN00009784,17500201,MDPR,30,,,a,\n',
 'ASN00010006,17500201,PRCP,0,,,a,\n',
 'ASN00010152,17500201,PRCP,0,,,a,\n',
 'ASN00010536,17500201,DATX,4,,,a,\n',
 'ASN00010536,17500201,MDTX,325,,,a,\n',
 'ASN00010628,17500201,PRCP,0,,,a,\n',
 'ASN00010655,17500201,PRCP,0,,,a,\n',
 'ASN00010729,17500201,DAPR,3,,,a,\n',
 'ASN00010729,17500201,DWPR,3,,,a,\n',
 'ASN00010729,1

In [10]:
with open('./data/1750.csv', 'w') as out:
    out.writelines(noaa_strings)
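The task asks for an explicit `decode` step, but `gzip.open` can also decode while reading when it is opened in text mode (`'rt'`). The sketch below shows that alternative on a tiny archive it creates itself in a temporary folder, so it runs without a download; the two sample rows are copied from the output above.

```python
import gzip
import os
import tempfile

# Build a tiny .gz file so the sketch runs without a download.
sample = 'ASN00002061,17500201,PRCP,56,,,a,\nASN00003014,17500201,PRCP,0,,,a,\n'
folder = tempfile.mkdtemp()
gz_path = os.path.join(folder, 'sample.csv.gz')
with gzip.open(gz_path, 'wt', encoding='utf-8') as f:
    f.write(sample)

# 'rt' opens the archive in text mode, so each line is already a str.
with gzip.open(gz_path, 'rt', encoding='utf-8') as f:
    text_lines = f.readlines()

# Write the decoded lines to an uncompressed file next to the archive.
csv_path = os.path.join(folder, 'sample.csv')
with open(csv_path, 'w') as out:
    out.writelines(text_lines)
```

Reading in binary mode and decoding by hand, as done above, makes the `bytes` to `str` conversion visible; text mode simply folds the two steps into one.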

## Problem 2 - Find all stations within **25** miles of Winona, MN.

The file linked below contains information about all stations tracked by NOAA.  

*Main folder:* https://www.ncei.noaa.gov/pub/data/ghcn/daily/

*Station txt file:* https://www.ncei.noaa.gov/pub/data/ghcn/daily/ghcnd-stations.txt

*Note.* While it would be easier to use the CSV version of the station file, you should use the TXT version here (for practice).

**Your tasks.** Our goal is to get a list of stations that are within 25 miles of Winona. Do this as follows.

1. Use `wget` to download the station information into the `./data` folder.
2. Use `with` to read the lines of this file.
3. At this point, the lines are strings in a fixed-width format, with fields separated by whitespace. Use a list comprehension with the string `split` method to split the raw lines (strings) into lists of entries.
4. There are three entries of interest: the station ID and the lat-long coordinates of the station. Inspect the file to determine the index of each of these three entries.
5. We want to transform each line (currently a list of strings) into a record, which is a `dict` with good names for the entries as keys and values of an appropriate type (`str` for the station ID, `float` for the lat-long). Use a comprehension to create a list of records as described.
6. Use another comprehension to apply a filter to the stations, keeping only those within 25 miles of Winona.
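Splitting on whitespace works for the first three entries, but it breaks multi-word station names into several pieces. Because the file is fixed-width, an alternative is to slice each line by column position. The sketch below is only an illustration: the column ranges are an assumption based on the layout described in the NOAA readme and have been checked here only against the first line of the file, and `parse_station` is a hypothetical helper name.

```python
def parse_station(line):
    """Turn one fixed-width station line into a record (dict).

    The slice positions are assumed from the file layout, not verified
    against every line of the file.
    """
    return {
        'Station ID': line[0:11],
        'Latitude': float(line[12:20]),
        'Longitude': float(line[21:30]),
        'Name': line[41:71].strip(),
    }


# First line of ghcnd-stations.txt (trailing padding omitted).
sample_line = 'ACW00011604  17.1167  -61.7833   10.1    ST JOHNS COOLIDGE FLD\n'
record = parse_station(sample_line)
print(record)
```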

In [None]:
#don't run again! file already in data folder!
stations_txt_url = 'https://www.ncei.noaa.gov/pub/data/ghcn/daily/ghcnd-stations.txt'

data = wget.download(stations_txt_url, out = './data')

In [15]:
with open('./data/ghcnd-stations.txt', 'r') as file:
    stations = file.readlines()
stations

['ACW00011604  17.1167  -61.7833   10.1    ST JOHNS COOLIDGE FLD                       \n',
 'ACW00011647  17.1333  -61.7833   19.2    ST JOHNS                                    \n',
 'AE000041196  25.3330   55.5170   34.0    SHARJAH INTER. AIRP            GSN     41196\n',
 'AEM00041194  25.2550   55.3640   10.4    DUBAI INTL                             41194\n',
 'AEM00041217  24.4330   54.6510   26.8    ABU DHABI INTL                         41217\n',
 'AEM00041218  24.2620   55.6090  264.9    AL AIN INTL                            41218\n',
 'AF000040930  35.3170   69.0170 3366.0    NORTH-SALANG                   GSN     40930\n',
 'AFM00040938  34.2100   62.2280  977.2    HERAT                                  40938\n',
 'AFM00040948  34.5660   69.2120 1791.3    KABUL INTL                             40948\n',
 'AFM00040990  31.5000   65.8500 1010.0    KANDAHAR AIRPORT                       40990\n',
 'AG000060390  36.7167    3.2500   24.0    ALGER-DAR EL BEIDA             GSN   

In [16]:
(split_data := [line.strip().split() for line in stations])
#in each split line, entry 0 is the station ID
#entry 1 is the latitude
#entry 2 is the longitude

[['ACW00011604',
  '17.1167',
  '-61.7833',
  '10.1',
  'ST',
  'JOHNS',
  'COOLIDGE',
  'FLD'],
 ['ACW00011647', '17.1333', '-61.7833', '19.2', 'ST', 'JOHNS'],
 ['AE000041196',
  '25.3330',
  '55.5170',
  '34.0',
  'SHARJAH',
  'INTER.',
  'AIRP',
  'GSN',
  '41196'],
 ['AEM00041194', '25.2550', '55.3640', '10.4', 'DUBAI', 'INTL', '41194'],
 ['AEM00041217',
  '24.4330',
  '54.6510',
  '26.8',
  'ABU',
  'DHABI',
  'INTL',
  '41217'],
 ['AEM00041218', '24.2620', '55.6090', '264.9', 'AL', 'AIN', 'INTL', '41218'],
 ['AF000040930',
  '35.3170',
  '69.0170',
  '3366.0',
  'NORTH-SALANG',
  'GSN',
  '40930'],
 ['AFM00040938', '34.2100', '62.2280', '977.2', 'HERAT', '40938'],
 ['AFM00040948', '34.5660', '69.2120', '1791.3', 'KABUL', 'INTL', '40948'],
 ['AFM00040990',
  '31.5000',
  '65.8500',
  '1010.0',
  'KANDAHAR',
  'AIRPORT',
  '40990'],
 ['AG000060390',
  '36.7167',
  '3.2500',
  '24.0',
  'ALGER-DAR',
  'EL',
  'BEIDA',
  'GSN',
  '60390'],
 ['AG000060590', '30.5667', '2.8667', '397.0

In [17]:
(stations_dict := [{'Station ID' : entry[0],
                  'Latitude' : float(entry[1]),
                  'Longitude' : float(entry[2])
                 }
                for entry in split_data
                ]
)

[{'Station ID': 'ACW00011604', 'Latitude': 17.1167, 'Longitude': -61.7833},
 {'Station ID': 'ACW00011647', 'Latitude': 17.1333, 'Longitude': -61.7833},
 {'Station ID': 'AE000041196', 'Latitude': 25.333, 'Longitude': 55.517},
 {'Station ID': 'AEM00041194', 'Latitude': 25.255, 'Longitude': 55.364},
 {'Station ID': 'AEM00041217', 'Latitude': 24.433, 'Longitude': 54.651},
 {'Station ID': 'AEM00041218', 'Latitude': 24.262, 'Longitude': 55.609},
 {'Station ID': 'AF000040930', 'Latitude': 35.317, 'Longitude': 69.017},
 {'Station ID': 'AFM00040938', 'Latitude': 34.21, 'Longitude': 62.228},
 {'Station ID': 'AFM00040948', 'Latitude': 34.566, 'Longitude': 69.212},
 {'Station ID': 'AFM00040990', 'Latitude': 31.5, 'Longitude': 65.85},
 {'Station ID': 'AG000060390', 'Latitude': 36.7167, 'Longitude': 3.25},
 {'Station ID': 'AG000060590', 'Latitude': 30.5667, 'Longitude': 2.8667},
 {'Station ID': 'AG000060611', 'Latitude': 28.05, 'Longitude': 9.6331},
 {'Station ID': 'AG000060680', 'Latitude': 22.8, '

In [23]:
from geopy.distance import distance

(stations_within_25 := [station for station in stations_dict
                   if distance(winona_lat_long, (station['Latitude'], station['Longitude'])).miles <= 25
])

[{'Station ID': 'US1MNHS0001', 'Latitude': 43.835, 'Longitude': -91.314},
 {'Station ID': 'US1MNHS0006', 'Latitude': 43.742, 'Longitude': -91.4369},
 {'Station ID': 'US1MNHS0007', 'Latitude': 43.8349, 'Longitude': -91.3138},
 {'Station ID': 'US1MNHS0008', 'Latitude': 43.8381, 'Longitude': -91.3079},
 {'Station ID': 'US1MNHS0009', 'Latitude': 43.8387, 'Longitude': -91.3044},
 {'Station ID': 'US1MNHS0012', 'Latitude': 43.8253, 'Longitude': -91.3209},
 {'Station ID': 'US1MNHS0013', 'Latitude': 43.7817, 'Longitude': -91.3882},
 {'Station ID': 'US1MNHS0022', 'Latitude': 43.7921, 'Longitude': -91.5856},
 {'Station ID': 'US1MNHS0023', 'Latitude': 43.7122, 'Longitude': -91.6541},
 {'Station ID': 'US1MNOL0038', 'Latitude': 44.0762, 'Longitude': -92.0979},
 {'Station ID': 'US1MNOL0991', 'Latitude': 44.0415, 'Longitude': -92.0882},
 {'Station ID': 'US1MNWN0002', 'Latitude': 44.0171, 'Longitude': -91.6098},
 {'Station ID': 'US1MNWN0003', 'Latitude': 43.9858, 'Longitude': -91.8719},
 {'Station ID':

In [24]:
len(stations_within_25)

61
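Computing an exact distance for every station in the file is slow. One common speed-up, sketched below under rough assumptions, is to discard stations that fall outside a latitude/longitude box around Winona before calling `distance` on the remainder. The constants (about 69 miles per degree of latitude and a 10% safety padding) are approximations chosen for this example, and `in_bounding_box` is a hypothetical helper; the box is meant to be generous, so the exact filter is still needed afterwards.

```python
from math import cos, radians

MILES_PER_DEGREE_LAT = 69.0  # rough assumption; the true value varies slightly
PADDING = 1.1                # assumed safety factor so the box errs on the large side


def in_bounding_box(center, point, miles):
    """Cheap pre-check: is point inside a lat/long box around center?"""
    lat_margin = PADDING * miles / MILES_PER_DEGREE_LAT
    lon_margin = PADDING * miles / (MILES_PER_DEGREE_LAT * cos(radians(center[0])))
    return (abs(point[0] - center[0]) <= lat_margin
            and abs(point[1] - center[1]) <= lon_margin)


winona_lat_long = (44.0499, -91.6393)
sample_stations = [
    {'Station ID': 'US1MNWN0002', 'Latitude': 44.0171, 'Longitude': -91.6098},
    {'Station ID': 'ACW00011604', 'Latitude': 17.1167, 'Longitude': -61.7833},
]
candidates = [s for s in sample_stations
              if in_bounding_box(winona_lat_long, (s['Latitude'], s['Longitude']), 25)]
print(candidates)
```

Only the stations in `candidates` would then be passed to the `distance(...).miles <= 25` test.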

## Problem 3 - Prototype downloading and uncompressing a station file.

Before we download and uncompress all the stations of interest, let's practice on one station file.


1. Copy the URL for some station and store it as a variable named `url`.
2. Write `lambda` functions that extract each of the following from the station `url`: compressed file name, compressed file path (e.g., `./data/...`), and uncompressed file path (e.g., `./data/...`).
3. Write a `lambda` function that extracts
4. Use `wget` to download this station's data.
5. Use `gzip` to uncompress the data.
6. Write the data to an output file.

Your code should have the following shape:

```{Python}
wget.download(...)
with gzip.open(...) as f:
    with open(..., 'w') as out:
        lines = f.readlines()
        ... # Convert lines to strings here
        out.writelines(...)
```

Use your helper functions to fill in some of the `...` placeholders.

In [6]:
url = 'https://www.ncei.noaa.gov/pub/data/ghcn/daily/by_station/ACW00011604.csv.gz'

In [7]:
(compressed_file_name := lambda url: url.split('/')[-1])
#the file name is the last piece of the url, after the final '/'

<function __main__.<lambda>(url)>

In [21]:
compressed_file_name(url)

'ACW00011604.csv.gz'

In [8]:
(compressed_file_path := lambda url: f"./data/{compressed_file_name(url)}")

<function __main__.<lambda>(url)>

In [9]:
(uncompressed_file_path := lambda url: compressed_file_path(url).replace('.gz', ''))

<function __main__.<lambda>(url)>

In [10]:
wget.download(url, out = './data')

'./data/ACW00011604.csv (1).gz'

In [13]:
station_header = """ID, YEAR/MONTH/DAY, ELEMENT, DATA VALUE, M-FLAG, Q-FLAG, S-FLAG, OBS-TIME"""

with gzip.open(compressed_file_path(url)) as f:
    with open(uncompressed_file_path(url), 'w') as out:
        lines = f.readlines()     #list of bytes
        decoded_lines = [l.decode('utf-8') for l in lines]     #list of strings
        #end the header with a newline so the first record starts on its own line
        out.writelines([station_header + '\n'] + decoded_lines)
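The lab asks for `lambda` helpers, but the same three helpers can also be written as named functions, which leaves room for docstrings and a configurable folder. The sketch below is one possible version using only the standard library; the names `archive_name`, `archive_path`, and `extracted_path` are made up for this example and are not part of the lab.

```python
import posixpath
from urllib.parse import urlsplit


def archive_name(url):
    """File name at the end of the URL path, e.g. 'X.csv.gz'."""
    return posixpath.basename(urlsplit(url).path)


def archive_path(url, folder='./data'):
    """Where the compressed file is expected to land after downloading."""
    return f"{folder}/{archive_name(url)}"


def extracted_path(url, folder='./data'):
    """Same path with only the trailing '.gz' removed."""
    path = archive_path(url, folder)
    return path[:-len('.gz')] if path.endswith('.gz') else path


example_url = 'https://www.ncei.noaa.gov/pub/data/ghcn/daily/by_station/ACW00011604.csv.gz'
print(archive_name(example_url))
print(archive_path(example_url))
print(extracted_path(example_url))
```

Unlike `str.replace('.gz', '')`, the suffix check only strips `.gz` at the end of the path, which matters only if `.gz` ever appears elsewhere in a name.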

## Problem 4 - Build the station URLs and download the files.

**Tasks.** Now build URLs for all stations of interest and combine their data as follows.

1. Use `with` and `open` to write the header for the output file. The correct header can be determined using the appropriate readme file from the main folder linked above.
2. Use a comprehension to extract the stations of interest into a list.
3. Investigate the structure of the files stored in the `by_station` folder (see main folder link above).
4. Use a comprehension and an `f` string to build a list of URLs for all stations of interest.
5. Use `wget` to download the data for the stations of interest into the data folder.
6. Use `gzip` to uncompress the files.
7. Convert the `bytes` to `str` by decoding as `utf-8`.
8. Use the append mode `"a"` of `open` with `writelines` to append the data in each file to your output file.

While we usually avoid using a `for` loop, we make an exception for lengthy IO code. To accomplish steps 5 through 8, use a `for` loop with the following shape.

```{Python}
with open(..., 'w') as out:
    out.write(header)

for url in station_urls:
    wget.download(...)
    with gzip.open(...) as f:
        with open(..., 'a') as out:
            lines = f.readlines()
            ... # Convert lines to strings here
            out.writelines(...)
    print(f"Downloaded and extracted the data for {url}")
```

Note that the code inside the loop should resemble the code from the previous step.

Also, if you get HTTP exceptions for some of the URLs, you can use `try/except` to skip them or print information about the problematic stations. Place the code for downloading/unzipping/writing the data in the `try` block and print information about the exception in the `except` block, along the lines shown below.

```{Python}
with open(..., 'w') as out:
    out.write(header)

for url in station_urls:
    try:
        wget.download(...)
        with gzip.open(...) as f:
            with open(..., 'a') as out:
                lines = f.readlines()
                ... # Convert lines to strings here
                out.writelines(...)
        print(f"Downloaded and extracted the data for {url}")
    except Exception as e:
        print(f"Problem downloading {url}")
        print(f"Exception: {e}")
```
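To see the `try/except` pattern run end to end without touching the network, the sketch below replaces the download/unzip/write steps with a stand-in function. Both `process_all` and `fake_fetch` are hypothetical names, and the two URLs are made up; the point is only that one failing URL is reported and the loop carries on.

```python
def process_all(urls, fetch):
    """Call fetch(url) for every url; return the (url, error) pairs that failed."""
    failures = []
    for url in urls:
        try:
            fetch(url)
            print(f"Downloaded and extracted the data for {url}")
        except Exception as e:
            print(f"Problem downloading {url}")
            print(f"Exception: {e}")
            failures.append((url, e))
    return failures


def fake_fetch(url):
    # Stand-in for the download/unzip/append steps; fails for one made-up URL.
    if 'bad' in url:
        raise OSError('HTTP Error 504: Gateway Time-out')


failures = process_all(['https://example.com/good.csv.gz',
                        'https://example.com/bad.csv.gz'], fake_fetch)
```

Collecting the failures in a list, rather than only printing them, makes it easy to retry just those URLs later.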

In [25]:
(station_IDs := [station['Station ID'] for station in stations_within_25])

['US1MNHS0001',
 'US1MNHS0006',
 'US1MNHS0007',
 'US1MNHS0008',
 'US1MNHS0009',
 'US1MNHS0012',
 'US1MNHS0013',
 'US1MNHS0022',
 'US1MNHS0023',
 'US1MNOL0038',
 'US1MNOL0991',
 'US1MNWN0002',
 'US1MNWN0003',
 'US1MNWN0004',
 'US1MNWN0005',
 'US1MNWN0006',
 'US1MNWN0007',
 'US1MNWN0009',
 'US1MNWN0011',
 'US1MNWN0013',
 'US1MNWN0019',
 'US1MNWN0021',
 'US1MNWN0023',
 'US1MNWN0024',
 'US1MNWN0026',
 'US1MNWN0027',
 'US1MNWN0029',
 'US1MNWN0031',
 'US1MNWN0033',
 'US1WIBF0002',
 'US1WILC0003',
 'US1WILC0010',
 'US1WILC0011',
 'US1WILC0018',
 'US1WILC0022',
 'US1WITR0002',
 'US1WITR0003',
 'US1WITR0004',
 'US1WITR0005',
 'US1WITR0008',
 'US1WITR0010',
 'USC00210146',
 'USC00210559',
 'USC00213808',
 'USC00213812',
 'USC00214418',
 'USC00215488',
 'USC00217184',
 'USC00217277',
 'USC00218951',
 'USC00219067',
 'USC00219072',
 'USC00219077',
 'USC00470124',
 'USC00472165',
 'USC00472992',
 'USC00472996',
 'USC00474366',
 'USC00478589',
 'USW00004956',
 'USW00014920']

In the `by_station` folder, each station has its own `.csv.gz` file, so every file must be uncompressed after it is downloaded.

In [39]:
all_station_urls = lambda station_IDs: [f"https://www.ncei.noaa.gov/pub/data/ghcn/daily/by_station/{station_ID}.csv.gz" for station_ID in station_IDs]
(station_urls := all_station_urls(station_IDs))

['https://www.ncei.noaa.gov/pub/data/ghcn/daily/by_station/US1MNHS0001.csv.gz',
 'https://www.ncei.noaa.gov/pub/data/ghcn/daily/by_station/US1MNHS0006.csv.gz',
 'https://www.ncei.noaa.gov/pub/data/ghcn/daily/by_station/US1MNHS0007.csv.gz',
 'https://www.ncei.noaa.gov/pub/data/ghcn/daily/by_station/US1MNHS0008.csv.gz',
 'https://www.ncei.noaa.gov/pub/data/ghcn/daily/by_station/US1MNHS0009.csv.gz',
 'https://www.ncei.noaa.gov/pub/data/ghcn/daily/by_station/US1MNHS0012.csv.gz',
 'https://www.ncei.noaa.gov/pub/data/ghcn/daily/by_station/US1MNHS0013.csv.gz',
 'https://www.ncei.noaa.gov/pub/data/ghcn/daily/by_station/US1MNHS0022.csv.gz',
 'https://www.ncei.noaa.gov/pub/data/ghcn/daily/by_station/US1MNHS0023.csv.gz',
 'https://www.ncei.noaa.gov/pub/data/ghcn/daily/by_station/US1MNOL0038.csv.gz',
 'https://www.ncei.noaa.gov/pub/data/ghcn/daily/by_station/US1MNOL0991.csv.gz',
 'https://www.ncei.noaa.gov/pub/data/ghcn/daily/by_station/US1MNWN0002.csv.gz',
 'https://www.ncei.noaa.gov/pub/data/ghc

In [41]:
station_header = """ID, YEAR/MONTH/DAY, ELEMENT, DATA VALUE, M-FLAG, Q-FLAG, S-FLAG, OBS-TIME"""
output_file_path = "./data/station_data.csv"

with open(output_file_path, 'w') as out:
    out.write(station_header + '\n')     #newline so the first record starts on its own line

for url in station_urls:
    wget.download(url, out = './data')

    with gzip.open(compressed_file_path(url)) as f:
        with open(uncompressed_file_path(url), 'w') as out:
            lines = f.readlines()     #list of bytes
            decoded_lines = [l.decode('utf-8') for l in lines]     #list of strings
            out.writelines(decoded_lines)
    print(f"Downloaded and extracted the data for {url}")

Downloaded and extracted the data for https://www.ncei.noaa.gov/pub/data/ghcn/daily/by_station/US1MNHS0001.csv.gz
Downloaded and extracted the data for https://www.ncei.noaa.gov/pub/data/ghcn/daily/by_station/US1MNHS0006.csv.gz
Downloaded and extracted the data for https://www.ncei.noaa.gov/pub/data/ghcn/daily/by_station/US1MNHS0007.csv.gz
Downloaded and extracted the data for https://www.ncei.noaa.gov/pub/data/ghcn/daily/by_station/US1MNHS0008.csv.gz
Downloaded and extracted the data for https://www.ncei.noaa.gov/pub/data/ghcn/daily/by_station/US1MNHS0009.csv.gz
Downloaded and extracted the data for https://www.ncei.noaa.gov/pub/data/ghcn/daily/by_station/US1MNHS0012.csv.gz
Downloaded and extracted the data for https://www.ncei.noaa.gov/pub/data/ghcn/daily/by_station/US1MNHS0013.csv.gz
Downloaded and extracted the data for https://www.ncei.noaa.gov/pub/data/ghcn/daily/by_station/US1MNHS0022.csv.gz
Downloaded and extracted the data for https://www.ncei.noaa.gov/pub/data/ghcn/daily/by_station/US

In [44]:
(all_uncompressed_station_files := [uncompressed_file_path(url) for url in station_urls])

['./data/US1MNHS0001.csv',
 './data/US1MNHS0006.csv',
 './data/US1MNHS0007.csv',
 './data/US1MNHS0008.csv',
 './data/US1MNHS0009.csv',
 './data/US1MNHS0012.csv',
 './data/US1MNHS0013.csv',
 './data/US1MNHS0022.csv',
 './data/US1MNHS0023.csv',
 './data/US1MNOL0038.csv',
 './data/US1MNOL0991.csv',
 './data/US1MNWN0002.csv',
 './data/US1MNWN0003.csv',
 './data/US1MNWN0004.csv',
 './data/US1MNWN0005.csv',
 './data/US1MNWN0006.csv',
 './data/US1MNWN0007.csv',
 './data/US1MNWN0009.csv',
 './data/US1MNWN0011.csv',
 './data/US1MNWN0013.csv',
 './data/US1MNWN0019.csv',
 './data/US1MNWN0021.csv',
 './data/US1MNWN0023.csv',
 './data/US1MNWN0024.csv',
 './data/US1MNWN0026.csv',
 './data/US1MNWN0027.csv',
 './data/US1MNWN0029.csv',
 './data/US1MNWN0031.csv',
 './data/US1MNWN0033.csv',
 './data/US1WIBF0002.csv',
 './data/US1WILC0003.csv',
 './data/US1WILC0010.csv',
 './data/US1WILC0011.csv',
 './data/US1WILC0018.csv',
 './data/US1WILC0022.csv',
 './data/US1WITR0002.csv',
 './data/US1WITR0003.csv',
 

In [45]:
with open(output_file_path, 'a') as final_output:
    for uncompressed_file in all_uncompressed_station_files:
        with open(uncompressed_file, 'r') as f:
            lines = f.readlines()
            final_output.writelines(lines)
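The combining step can be checked on a small scale before trusting it with all 61 station files. The sketch below wraps the same idea in a hypothetical `combine_files` function and runs it on two one-line files it creates in a temporary folder; the two rows are made-up sample records in the station file layout, not real observations.

```python
import os
import tempfile


def combine_files(paths, output_path, header):
    """Write header once, then append the lines of every file in paths."""
    with open(output_path, 'w') as out:
        out.write(header + '\n')     # newline keeps the first record on its own line
        for path in paths:
            with open(path, 'r') as f:
                out.writelines(f)    # a file object yields its lines one by one


header = "ID, YEAR/MONTH/DAY, ELEMENT, DATA VALUE, M-FLAG, Q-FLAG, S-FLAG, OBS-TIME"
rows = ['US1MNHS0001,20100101,PRCP,0,,,N,\n',     # made-up sample rows
        'US1MNHS0006,20100101,PRCP,5,,,N,\n']

folder = tempfile.mkdtemp()
paths = []
for i, row in enumerate(rows):
    path = os.path.join(folder, f'part{i}.csv')
    with open(path, 'w') as f:
        f.write(row)
    paths.append(path)

combined_path = os.path.join(folder, 'combined.csv')
combine_files(paths, combined_path, header)
with open(combined_path) as f:
    combined_text = f.read()
print(combined_text)
```

Passing the file object straight to `writelines` avoids building a full list of lines in memory for each station file.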