# Processors

In this notebook we are going to develop the `Processor` class.
A processor is a specific type of parser, but it calculates its output instead of providing a target endpoint for the downloader to fetch.

To do that, the processor returns `None` as its remote folder, which tells the `Downloader` not to "download" the file but to calculate it instead. This switch happens in the `Downloader`'s `get_file` method. The end goal is that the user can call `get_file` regardless of whether the file exists in the remote directory.
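The switch described above can be illustrated with a minimal, self-contained sketch. It does not use the real `mergedownloader` classes: `SketchParser`, `SketchProcessor`, the folder layout, the file names and this standalone `get_file` are all hypothetical stand-ins, and the function returns a description of the action instead of touching any file.

```python
from datetime import datetime
from typing import Optional


class SketchParser:
    """Hypothetical parser: the file exists on the remote server."""

    def filename(self, date: datetime) -> str:
        return f"rain_{date:%Y%m%d}.nc"

    def remote_folder(self, date: datetime) -> Optional[str]:
        return f"/remote/{date:%Y/%m}"


class SketchProcessor(SketchParser):
    """Hypothetical processor: there is no remote copy, so the folder is None."""

    def filename(self, date: datetime) -> str:
        return f"accum_{date:%Y%m}.nc"

    def remote_folder(self, date: datetime) -> Optional[str]:
        return None


def get_file(parser: SketchParser, date: datetime) -> str:
    """Describe the action taken; a stand-in for returning a local file path."""
    folder = parser.remote_folder(date)
    if folder is None:
        # No remote endpoint: the file has to be calculated locally.
        return f"calculated {parser.filename(date)}"
    return f"downloaded {folder}/{parser.filename(date)}"
```

Under these assumptions the caller uses the same `get_file` call for both kinds of object, and only the `None` remote folder decides which branch runs.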

In [198]:
%load_ext autoreload
%autoreload 2

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


## The `MonthlyAccumManual` Class

Now that we have a class to download the files and the parsers to understand the remote structure, let's set up the `Downloader` class, which combines both, and use it to retrieve the `MONTHLY_ACCUM_MANUAL` processor.

In [199]:
import logging
from datetime import datetime
from mergedownloader.inpeparser import *
from mergedownloader.downloader import Downloader
from mergedownloader.file_downloader import FileDownloader, ConnectionType, DownloadMode



In [200]:
fd = FileDownloader(server=INPE_SERVER, connection_type=ConnectionType.HTTP, download_mode=DownloadMode.UPDATE)


Using wget through HTTP on: ftp.cptec.inpe.br


In [201]:
downloader = Downloader(
    file_downloader=fd,
    parsers=InpeParsers,
    local_folder='/tmp2',
    log_level=logging.DEBUG
)

In [202]:
DATE = datetime(2022, 1, 1)

processor = downloader.get_parser(InpeTypes.MONTHLY_ACCUM_MANUAL)

print(processor.filename(DATE))
print(processor.foldername(DATE))
print(processor.remote_folder(DATE))
print(processor.remote_target(DATE))
print(processor.local_folder(DATE, '/tmp'))

MERGE_CPTEC_acum_jan_2022.nc
MONTHLY_ACCUM_MANUAL
None
None
/tmp/MONTHLY_ACCUM_MANUAL


In [203]:
parser = DailyParser()

In [204]:
dates = parser.dates_range(start_date='2024-09-01', end_date='2024-09-30')

In [205]:
# files = downloader.get_files(dates, InpeTypes.DAILY_RAIN)

In [206]:
# files

In [207]:
# downloader.open_file(date='2024-09-22', datatype=InpeTypes.MONTHLY_ACCUM_MANUAL)

In [208]:
from mergedownloader.parser import AbstractProcessor, AbstractParser

In [209]:
processor = downloader.get_parser(InpeTypes.DAILY_RAIN)

In [210]:
isinstance(processor, AbstractParser)

True

In [224]:
dset = downloader.open_file(date='2024-02-23', datatype=InpeTypes.MONTHLY_ACCUM_MANUAL)
dset

In [225]:
import xarray as xr  # needed for xr.open_dataset

xr.open_dataset('/tmp2/MONTHLY_ACCUM_MANUAL/MERGE_CPTEC_acum_feb_2024.nc')

In [223]:
dset.rio.crs

CRS.from_epsg(4326)

In [218]:
dset = xr.open_dataset(dset)

In [219]:
dset.rio.crs

CRS.from_epsg(4326)

When the user asks for a file, `get_file` checks whether the requested data type is handled by a parser or by a processor:

- If it is a parser, `get_file` calls the download method and passes it the parser.
- If it is a processor, `get_file` calls the processing method and passes it the processor. The processing method asks the processor which files it needs (dates and data type), and then calls back into the processor class to create the output file from either the links to those files or the files themselves (already opened). This way, the processor does not have to communicate with the downloader or take care of any downloading itself.
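The flow outlined above can be sketched as follows. This is only an illustration under stated assumptions, not the package's implementation: `ProcessorSketch`, `MonthlySumSketch`, `required_dates`, `process`, `run_processor` and the `fetch` callable are hypothetical names, and file names stand in for real downloaded files.

```python
from abc import ABC, abstractmethod
from datetime import date, timedelta
from typing import Callable, List


class ProcessorSketch(ABC):
    """Hypothetical processor interface: it declares its inputs and builds the output."""

    source_type: str  # data type of the input files the processor needs

    @abstractmethod
    def required_dates(self, when: date) -> List[date]:
        """Dates of the input files needed to build the output for `when`."""

    @abstractmethod
    def process(self, files: List[str]) -> str:
        """Build the output from files that were already fetched."""


class MonthlySumSketch(ProcessorSketch):
    source_type = "DAILY_RAIN"

    def required_dates(self, when: date) -> List[date]:
        first = when.replace(day=1)
        # Jump safely into the next month, then snap back to its first day.
        next_month = (first + timedelta(days=32)).replace(day=1)
        return [first + timedelta(days=i) for i in range((next_month - first).days)]

    def process(self, files: List[str]) -> str:
        return f"accumulated {len(files)} daily files"


def run_processor(
    processor: ProcessorSketch, when: date, fetch: Callable[[date, str], str]
) -> str:
    """Downloader side: fetch what the processor asks for, then call it back."""
    files = [fetch(d, processor.source_type) for d in processor.required_dates(when)]
    return processor.process(files)
```

In this sketch the processor never sees the downloader: it only states which dates and data type it needs and receives the resulting files, so all downloading stays on the downloader side through `fetch`.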
   