# Phase 3: Producing and Implementing
- Python, Pandas and Matplotlib: Use Python Pandas to read and manipulate data. Create relevant charts
and visualisations using Matplotlib.
- User Interface: How does the user find and generate the data? In a separate Python program, create an
intuitive text-based or graphical user interface (GUI). For full marks, you need a GUI.

## Get Your Data
Acquire any data you need, whether it be from .txt files, .csv files or APIs. For full marks, you need to be using APIs.

I hope FTP counts as an API.

In [10]:
# Imports: FTP client, zip archive reader, in-memory byte buffer
import ftplib
import io
import zipfile

print("Connecting...")
server = ftplib.FTP()
server.connect("134.178.253.145")
print("Logging in...")
server.login()
print("Changing directory...")
server.cwd('anon/home/ncc/www/change/ACORN_SAT_daily/')

print("Downloading file (may take a while)...")
newf = io.BytesIO()
server.retrbinary('RETR v2.4-raw-data-and-supporting-information.zip', newf.write)
print("Extracting file...")
z = zipfile.ZipFile(newf)
print("Finding files and taking their contents...")
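# Keep only the raw station files: skip the folder entry itself and the nested 7z archive
skip = ('raw-data/', 'raw-data/Raw data.7z')
dirs = [i for i in z.namelist() if i.startswith('raw-data/') and i not in skip]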
datas = [z.open(i).read().decode() for i in dirs]
print("Quitting...")
server.quit()
z

Connecting...
Logging in...
Changing directory...
Downloading file (may take a while)...
Extracting file...
Finding files and taking their contents...
Quitting...


<zipfile.ZipFile file=<_io.BytesIO object at 0x74995035c310> mode='r'>
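
The cell above keeps file names (`dirs`) and file contents (`datas`) in two parallel lists. As a rough alternative sketch, the same filtering could return a dictionary keyed by member name, so each station file stays attached to its text. The helper name `read_raw_members` and the member names in the demo archive are made up for illustration; the demo builds a tiny zip in memory so it can be tried without the FTP download.

```python
import io
import zipfile


def read_raw_members(archive, prefix="raw-data/"):
    """Return {member name: decoded text} for the text files under prefix."""
    contents = {}
    for info in archive.infolist():
        name = info.filename
        if not name.startswith(prefix) or info.is_dir():
            continue  # outside the folder, or the folder entry itself
        if name.endswith(".7z"):
            continue  # nested archive, not plain text
        contents[name] = archive.read(name).decode()
    return contents


# Tiny in-memory archive with made-up member names, for an offline try-out.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as demo:
    demo.writestr("readme.txt", "hello")
    demo.writestr("raw-data/", "")
    demo.writestr("raw-data/Raw data.7z", b"\x00\x01")
    demo.writestr("raw-data/hqnew016098", "01609819971001   330  120\r\n")

with zipfile.ZipFile(buffer) as demo:
    raw_files = read_raw_members(demo)

print(sorted(raw_files))
```

With the real download, `read_raw_members(z)` would take the place of the `dirs` and `datas` lines, assuming every remaining member under `raw-data/` is plain text.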

In [11]:
print(z.open("readme.txt").read().decode())

This set of files contains the following:

- homogenised ACORN-SAT data
- raw station data for stations corresponding to the ACORN-SAT locations
- a file (primarysites.txt) with information on which site is the primary site for each ACORN-SAT occasion and for which period of time
(in most cases, there will be two or three primary sites which make up the input data for the overall ACORN-SAT record)
- a summary of adjustments and reference periods and stations. This information is also contained in the station catalogue.
- each of the transfer functions for these adjustments.

Format of raw data files

These file names have the file name hqnewNNNNNN, where NNNNNN is the station number.

Each day of data has the format:

NNNNNN YYYYMMDD  XXX  NNN

where NNNNNN is the station number, YYYYMMDD is the date, XXX is the maximum temperature in tenths of degrees C
(e.g. 251 = 25.1 C) and NNN is the minimum temperature. Missing data is shown as -999.

These data files have been quality controlled

In [12]:
datas[0]

'01609819971001   330  120\r\n01609819971002   280  120\r\n01609819971003   200  110\r\n01609819971004   180  120\r\n01609819971005   250   90\r\n01609819971006   190  110\r\n01609819971007   220   80\r\n01609819971008   290   90\r\n01609819971009   340  160\r\n01609819971010   340  240\r\n01609819971011   280  110\r\n01609819971012   240   90\r\n01609819971013   260   70\r\n01609819971014   300   90\r\n01609819971015   260   70\r\n01609819971016   310   90\r\n01609819971017   220  160\r\n01609819971018   140  110\r\n01609819971019   210   70\r\n01609819971020   230   60\r\n01609819971021   250  120\r\n01609819971022   280  110\r\n01609819971023   320  120\r\n01609819971024   350  140\r\n01609819971025   380  180\r\n01609819971026   290  190\r\n01609819971027   280  170\r\n01609819971028   320  200\r\n01609819971029   320  150\r\n01609819971030  -999  150\r\n01609819971031   200 -999\r\n01609819971101   230   70\r\n01609819971102   250   90\r\n01609819971103   260  110\r\n0160981997110
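
The readme shows a space between the station number and the date, but in the output above the two run together (`01609819971001`). Below is a minimal sketch of a parser that copes with either layout, converts tenths of a degree to degrees C, and turns the `-999` marker into `None`. The names `parse_raw_line`, `parse_raw_text` and `MISSING` are my own, and the sample lines are copied from the output above.

```python
from datetime import date

MISSING = -999  # marker for missing data, per the readme


def parse_raw_line(line):
    """Parse one record such as '01609819971001   330  120'."""
    station = line[:6]
    # Splitting the remainder works whether or not a space follows the station number.
    day_text, tmax_text, tmin_text = line[6:].split()
    day = date(int(day_text[:4]), int(day_text[4:6]), int(day_text[6:8]))
    tmax_raw, tmin_raw = int(tmax_text), int(tmin_text)
    tmax = None if tmax_raw == MISSING else tmax_raw / 10  # tenths of a degree -> degrees C
    tmin = None if tmin_raw == MISSING else tmin_raw / 10
    return station, day, tmax, tmin


def parse_raw_text(text):
    """Parse a whole file's text, skipping blank lines."""
    return [parse_raw_line(line) for line in text.splitlines() if line.strip()]


sample = (
    "01609819971001   330  120\r\n"
    "01609819971030  -999  150\r\n"
    "01609819971031   200 -999\r\n"
)
records = parse_raw_text(sample)
for record in records:
    print(record)
```

Calling `parse_raw_text(datas[0])` should give one tuple per day for the first station file, which could then be handed to Pandas for the charts.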