# Using ``rename_epw_files()`` to rename the EPWs for proper data analysis after simulation

The ``rename_epw_files`` function renames your EPW files following the naming convention "Country_City_RCPscenario-Year". It takes the Country and City fields from the EPW coordinates or from user-supplied dictionaries, and the RCPscenario and Year fields from the original file name. If the original name contains no reference to a scenario or year, the file is assigned to the Present scenario.
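
To make the convention concrete, here is a minimal sketch of how the fields could be joined into a name. It is only an illustration, not accim's implementation: ``build_epw_name`` is a hypothetical helper, and the capitalising and hyphenating rules are assumptions inferred from the renamed files shown in the rest of this notebook.

```python
def build_epw_name(country, city, scenario='Present', year='Present'):
    """Hypothetical helper (not part of accim): join the fields into
    'Country_City_RCPscenario-Year'."""

    def hyphenate(text):
        # Assumption: words are capitalised and joined with hyphens,
        # e.g. 'United kingdom' -> 'United-Kingdom'.
        return '-'.join(word.capitalize() for word in text.split())

    # Assumption: files without scenario information get the single tag 'Present'.
    tag = 'Present' if scenario == 'Present' else f'{scenario}-{year}'
    return f'{hyphenate(country)}_{hyphenate(city)}_{tag}'


print(build_epw_name('United kingdom', 'Aberdeen city'))
print(build_epw_name('Brazil', 'Ponta Grossa', 'RCP26', '2100'))
```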

In [1]:
from accim.data.preprocessing import rename_epw_files
help(rename_epw_files)

Help on class rename_epw_files in module accim.data.preprocessing:

class rename_epw_files(builtins.object)
 |  rename_epw_files(filelist=None, rename_city_dict: dict = None, country_to_city_dict: dict = None, confirm_renaming=None, confirm_deletion=None)
 |  
 |  Renames the EPW files following the name convention 'Country_City_RCPscenario-Year'.
 |  The Country and City fields are computed based on the coordinates of the EPW,
 |  and the RCPscenario and Year are taken from the original name.
 |  If no reference is found, the sample_EPWs are considered to be  for Present scenario.
 |  
 |  :param filelist: A list of the EPW files.
 |      If omitted, it will rename all sample_EPWs in that folder.
 |  :type filelist: list
 |  :param rename_city_dict: A dict to set the city field for each EPW file.
 |      It must follow the pattern
 |      {'city name to be search in epw file name': 'city name for the epw file if found`}
 |  :type rename_city_dict: dict
 |  :param country_to_city_dict:

First of all, let's see what files we have in the folder:

In [1]:
import os
previous_files = os.listdir()
previous_files

['.ipynb_checkpoints',
 'GBR_Aberdeen.Dyce.030910_IWEC.epw',
 'GBR_London.Gatwick.037760_IWEC.epw',
 'RCP26_2100_GC03_Ponta_Grossa.epw',
 'using_rename_epw_files.ipynb',
 '__init__.py']

You can see there are 3 EPW files, which are:

In [2]:
old_epws = [i for i in os.listdir() if i.endswith('.epw')]
print(old_epws)

['GBR_Aberdeen.Dyce.030910_IWEC.epw', 'GBR_London.Gatwick.037760_IWEC.epw', 'RCP26_2100_GC03_Ponta_Grossa.epw']


So let's rename them.

## Using data from coordinates

In [3]:
from accim.data.preprocessing import rename_epw_files
rename_epw_files(
    confirm_deletion=False,
)

Since no match has been found between RCP or SSP scenarios and EPW file name, Present scenario has been assigned to the following EPW files:
GBR_Aberdeen.Dyce.030910_IWEC.epw
GBR_London.Gatwick.037760_IWEC.epw
Since no match has been found between RCP or SSP scenario Year and EPW file name, Present year has been assigned to the following EPW files:
GBR_Aberdeen.Dyce.030910_IWEC.epw
GBR_London.Gatwick.037760_IWEC.epw
The geolocation process has taken: 2.43 seconds (0.81 s/EPW)

The previous and new names of the EPW files and their unique IDs are:
ID: 0 / GBR_Aberdeen.Dyce.030910_IWEC / United-Kingdom_Aberdeen-Dyce_Present
ID: 1 / GBR_London.Gatwick.037760_IWEC / United-Kingdom_London-Gatwick_Present
ID: 2 / RCP26_2100_GC03_Ponta_Grossa / Brazil_Ponta-Grossa_RCP26-2100



If any of the city or subcountry names needs some amendment (if you are not happy with any of the available options, you can exclude it from renaming at the next stage), please enter the EPW IDs separated by space; otherwise, hit enter to omit: 0 1



Regarding the file ID: 0 / old name: GBR_Aberdeen.Dyce.030910_IWEC / new name: United-Kingdom_Aberdeen-Dyce_Present, the address obtained from coordinates is: 
Dyce and Stoneywood, Aberdeen City, Alba / Scotland, AB21 0HJ, United Kingdom


Please enter the amended city or subcountry, which must be unique:  Aberdeen City



Regarding the file ID: 1 / old name: GBR_London.Gatwick.037760_IWEC / new name: United-Kingdom_London-Gatwick_Present, the address obtained from coordinates is: 
London Gatwick Airport, London Road, Lowfield Heath, Tinsley Green, Crawley, West Sussex, England, RH6 0PB, United Kingdom


Please enter the amended city or subcountry, which must be unique:  London City



The previous and new names of the EPW files after city or subcountry name amendments and their unique IDs are:
ID: 0 / GBR_Aberdeen.Dyce.030910_IWEC / United-Kingdom_Aberdeen-City_Present
ID: 1 / GBR_London.Gatwick.037760_IWEC / United-Kingdom_London-City_Present

The final list of previous and new names of the EPW files and their unique IDs is:
ID: 0 / GBR_Aberdeen.Dyce.030910_IWEC / United-Kingdom_Aberdeen-City_Present
ID: 1 / GBR_London.Gatwick.037760_IWEC / United-Kingdom_London-City_Present
ID: 2 / RCP26_2100_GC03_Ponta_Grossa / Brazil_Ponta-Grossa_RCP26-2100



If you want to exclude some EPWs from renaming, please enter the IDs separated by space, otherwise, hit enter to continue: 

Do you want to copy and rename the file or files? [y/n]: y


The file GBR_Aberdeen.Dyce.030910_IWEC has been renamed to United-Kingdom_Aberdeen-City_Present
The file GBR_London.Gatwick.037760_IWEC has been renamed to United-Kingdom_London-City_Present
The file RCP26_2100_GC03_Ponta_Grossa has been renamed to Brazil_Ponta-Grossa_RCP26-2100


<accim.data.preprocessing.rename_epw_files at 0x1f15dec7f40>

You can see above that 2 of the original EPW file names contained no reference to an RCP scenario, so these files have been considered as Present scenario. The same applies to the Year field. Finally, the output lists the previous and the new names of the EPWs.
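
The fallback to Present can be illustrated with a short sketch. This is not how accim parses the names internally: ``guess_scenario_and_year`` is a hypothetical helper, and both regular expressions are assumptions about what a scenario and a year look like in a file name.

```python
import os
import re


def guess_scenario_and_year(epw_file):
    """Hypothetical helper (not part of accim): read the scenario and year
    from an EPW file name, falling back to 'Present' for each."""
    stem = os.path.splitext(epw_file)[0]
    # Assumption: scenarios look like 'RCP26' or 'SSP245'.
    scenario = re.search(r'(?:RCP|SSP)\d+', stem, flags=re.IGNORECASE)
    # Assumption: the year is a standalone 4-digit number from 2000 to 2199,
    # so station codes such as '030910' are not mistaken for a year.
    year = re.search(r'(?<!\d)2[01]\d{2}(?!\d)', stem)
    return (
        scenario.group(0).upper() if scenario else 'Present',
        year.group(0) if year else 'Present',
    )


for name in ['GBR_Aberdeen.Dyce.030910_IWEC.epw', 'RCP26_2100_GC03_Ponta_Grossa.epw']:
    print(name, '->', guess_scenario_and_year(name))
```

The lookarounds in the year pattern are the design choice that matters here: without them, a long station code containing a run such as 2012 would be read as a year.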

We can see the new EPWs are:

In [4]:
new_epws = [i for i in os.listdir() if i.endswith('.epw') and i not in old_epws]
print(new_epws)

['Brazil_Ponta-Grossa_RCP26-2100.epw', 'United-Kingdom_Aberdeen-City_Present.epw', 'United-Kingdom_London-City_Present.epw']


Let's delete the new files so that we can run the notebook again.

In [5]:
for i in new_epws:
    os.remove(i)

## Using data from user-defined dicts

Sometimes, you might get an unexpected error when the data is obtained from the coordinates, so there is also a way to rename the files using user-defined data in the form of dictionaries. Firstly, to define the city or subcountry for the EPW file, we need to input a dictionary such as the following:

In [6]:
rename_city_dict={
        'Aberdeen': 'Aberdeen-city',
        'London': 'London-city'
    }
rename_city_dict

{'Aberdeen': 'Aberdeen-city', 'London': 'London-city'}

If we input that dictionary, accim will look for 'Aberdeen' in the EPW names and set 'Aberdeen-city' in the EPW city or subcountry field. The same applies to 'London' and 'London-city'.
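
The lookup just described can be sketched as a simple substring search. ``lookup_city`` is a hypothetical helper written for this explanation, not an accim function, and the case-sensitive matching is an assumption.

```python
def lookup_city(epw_file, rename_city_dict):
    """Hypothetical helper (not part of accim): return the city assigned to
    an EPW file name, or None if no dictionary key appears in the name."""
    for text_to_find, city in rename_city_dict.items():
        # Assumption: a plain, case-sensitive substring match.
        if text_to_find in epw_file:
            return city
    return None


cities = {'Aberdeen': 'Aberdeen-city', 'London': 'London-city'}
print(lookup_city('GBR_Aberdeen.Dyce.030910_IWEC.epw', cities))
print(lookup_city('RCP26_2100_GC03_Ponta_Grossa.epw', cities))
```

A file that matches no key, such as the Ponta Grossa one here, yields ``None`` in this sketch.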

To define which country is related to the city or subcountry, a different dictionary must be used, in the following form:

In [7]:
country_to_city_dict={
    'United kingdom': ['Aberdeen-city', 'London-city']
}
country_to_city_dict

{'United kingdom': ['Aberdeen-city', 'London-city']}

Here, 'United kingdom' will be set as the country for the cities listed under it, 'Aberdeen-city' and 'London-city', which are the city names defined in the rename_city_dict argument.
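
Since the dictionary maps each country to a list of cities, finding the country of a given city means reading it in reverse. A minimal sketch of that reverse lookup, where ``invert_country_dict`` is a hypothetical helper and not part of accim:

```python
def invert_country_dict(country_to_city_dict):
    """Hypothetical helper (not part of accim): turn
    {country: [city, ...]} into {city: country}."""
    return {
        city: country
        for country, cities in country_to_city_dict.items()
        for city in cities
    }


country_of = invert_country_dict({'United kingdom': ['Aberdeen-city', 'London-city']})
print(country_of)
```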

If a city or country is not defined, the coordinates will be used instead. In this case, that applies to 'RCP26_2100_GC03_Ponta_Grossa.epw', which does not appear in either dictionary. The coordinates would also be used if the city-country relations were only partially defined, leaving the country field of some city undefined. For instance:

In [8]:
country_to_city_dict_not_entirely_defined={
    'United kingdom': ['Aberdeen-city']
}
country_to_city_dict_not_entirely_defined

{'United kingdom': ['Aberdeen-city']}

In that case, we didn't say that the country for 'London-city' is 'United kingdom', and therefore its country would be undefined.
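
A quick check before calling the function can reveal such gaps. The sketch below is only a suggestion: ``cities_without_country`` is a hypothetical helper, not part of accim, and it simply compares the two dictionaries.

```python
def cities_without_country(rename_city_dict, country_to_city_dict):
    """Hypothetical helper (not part of accim): list the cities named in
    rename_city_dict that no country in country_to_city_dict covers."""
    covered = {
        city
        for cities in country_to_city_dict.values()
        for city in cities
    }
    return sorted(set(rename_city_dict.values()) - covered)


missing = cities_without_country(
    {'Aberdeen': 'Aberdeen-city', 'London': 'London-city'},
    {'United kingdom': ['Aberdeen-city']},
)
print(missing)
```

An empty list means every city has a country assigned.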

In [9]:
from accim.data.preprocessing import rename_epw_files
rename_epw_files(
    confirm_deletion=False,
    rename_city_dict={
        'Aberdeen': 'Aberdeen city',
        'London': 'London city',
        'Ponta_Grossa': 'Ponta Grossa city'
    },
    country_to_city_dict={
        'United kingdom': ['Aberdeen city', 'London city'],
        'Brazil': ['Ponta Grossa city']
    }
)


Since no match has been found between RCP or SSP scenarios and EPW file name, Present scenario has been assigned to the following EPW files:
GBR_Aberdeen.Dyce.030910_IWEC.epw
GBR_London.Gatwick.037760_IWEC.epw
Since no match has been found between RCP or SSP scenario Year and EPW file name, Present year has been assigned to the following EPW files:
GBR_Aberdeen.Dyce.030910_IWEC.epw
GBR_London.Gatwick.037760_IWEC.epw
The geolocation process has taken: 0.0 seconds (0.0 s/EPW)

The previous and new names of the EPW files and their unique IDs are:
ID: 0 / GBR_Aberdeen.Dyce.030910_IWEC / United-Kingdom_Aberdeen-City_Present
ID: 1 / GBR_London.Gatwick.037760_IWEC / United-Kingdom_London-City_Present
ID: 2 / RCP26_2100_GC03_Ponta_Grossa / Brazil_Ponta-Grossa-City_RCP26-2100



If any of the city or subcountry names needs some amendment (if you are not happy with any of the available options, you can exclude it from renaming at the next stage), please enter the EPW IDs separated by space; otherwise, hit enter to omit: 



The final list of previous and new names of the EPW files and their unique IDs is:
ID: 0 / GBR_Aberdeen.Dyce.030910_IWEC / United-Kingdom_Aberdeen-City_Present
ID: 1 / GBR_London.Gatwick.037760_IWEC / United-Kingdom_London-City_Present
ID: 2 / RCP26_2100_GC03_Ponta_Grossa / Brazil_Ponta-Grossa-City_RCP26-2100



If you want to exclude some EPWs from renaming, please enter the IDs separated by space, otherwise, hit enter to continue: 

Do you want to copy and rename the file or files? [y/n]: y


The file GBR_Aberdeen.Dyce.030910_IWEC has been renamed to United-Kingdom_Aberdeen-City_Present
The file GBR_London.Gatwick.037760_IWEC has been renamed to United-Kingdom_London-City_Present
The file RCP26_2100_GC03_Ponta_Grossa has been renamed to Brazil_Ponta-Grossa-City_RCP26-2100


<accim.data.preprocessing.rename_epw_files at 0x1f16f9ef8b0>

We can see the new EPWs are:

In [11]:
new_epws = [i for i in os.listdir() if i.endswith('.epw') and i not in old_epws]
print(new_epws)

['Brazil_Ponta-Grossa-City_RCP26-2100.epw', 'United-Kingdom_Aberdeen-City_Present.epw', 'United-Kingdom_London-City_Present.epw']


Let's delete the new files so that we can run the notebook again.

In [12]:
for i in new_epws:
    os.remove(i)