# Renaming CSV Files

This notebook renames the CSV files of the ESA project. The filenames stored in the SQL database are not very descriptive, so we rename them to give users a better experience.

The current filenames look something like this: 1059614_14_lattice-v_1.csv

In this notebook, we will rename them to something like: nrthmntn_table-3-summary-of-aquatics-field-work-and-aborigi_pt-1_pg-14_doc-num-A3Q6H2.csv

The new filename contains the following information:

- Short Name of Project (Ex: nrthmntn)
- Table Title of the extracted table (Ex: table-3-summary-of-aquatics-field-work-and-aborigi; the title is truncated to 50 characters to keep the file path under 256 characters)
- The Part of the table (a counter that increments for each additional CSV extracted from the same table)
- Page Number of the extracted CSV
- Document Number of the particular PDF
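
As a rough illustration of how these five parts fit together, the sketch below assembles a filename from pieces that have already been prepared. The helper name `compose_filename` and its parameter names are hypothetical and do not appear in the notebook; the separators are copied from the example filename above.

```python
def compose_filename(short_name, title_slug, part, page, doc_num):
    """Join the five components in the order listed above (hypothetical helper)."""
    return f"{short_name}_{title_slug}_pt-{part}_pg-{page}_doc-num-{doc_num}.csv"


example = compose_filename(
    short_name="nrthmntn",
    title_slug="table-3-summary-of-aquatics-field-work-and-aborigi",
    part=1,
    page=14,
    doc_num="A3Q6H2",
)
print(example)
```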



In [72]:
import pandas as pd
import os
import glob
import shutil
from time import localtime, strftime

In [73]:
pd.set_option('display.max_columns', 500)
pd.set_option('display.max_rows', 1000)
pd.set_option('display.max_colwidth', 255)

In [74]:
# filepath to the English and French index files
ENG_index_filepath = 'F:/Environmental Baseline Data/Version 4 - Final/Indices/ESA_website_ENG_03032021.csv'
# FRA_index_filepath = 'F:/Environmental Baseline Data/Version 4 - Final/Indices/ESA_website_FRA_titles_sections_translated.xlsx'
FRA_index_filepath = 'F:/Environmental Baseline Data/Version 4 - Final/Indices/ESA_website_FRA_03032021.csv'

In [75]:
# Loading index file of all tables
df = pd.read_csv(ENG_index_filepath, encoding='ISO-8859-1')
# df_FRA = pd.read_excel(FRA_index_filepath)
df_FRA = pd.read_csv(FRA_index_filepath, encoding='ISO-8859-1')

In [76]:
df.head()

Unnamed: 0,Title,Content Type,Application Name,Application Short Name,Application Filing Date,Company Name,Commodity,File Name,ESA Folder URL,Document Number,Data ID,PDF Download URL,Application Type (NEB Act),Pipeline Location,Hearing order,Consultant Name,Pipeline Status,Regulatory Instrument(s),Application URL,Decision URL,ESA Section(s),ESA Section(s) Index,ESA Section(s) Topics,CSV Download URL,PDF Page Number,PDF Page Count,PDF Size,PDF Outline,Download folder name,Zipped Project Link
0,Figure 13.1-1 EnCana Ekwan Pipeline,Figure,Application to Construct and Operate Ekwan Pipeline,Ekwan,3/17/2003,EnCana Ekwan Pipeline Inc.,Gas,A0H8C0 - 13.0 EIA - Section 13.1 to 13.6,https://apps.cer-rec.gc.ca/REGDOCS/Item/LoadResult/268693,A0H8C0,268706,https://apps.cer-rec.gc.ca/REGDOCS/File/Download/268706,Large Projects (over 40 km),"Alberta, British Columbia, All",GH-1-2003,AXYS Environmental Consulting Ltd.,Operating,GC-108,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/268876,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/293763,"Section 13.1: Introduction, Section 13.2: Project Description, Section 13.3: Assessment Methods, Section 13.4: Air, Section 13.5: Terrain and Soils, Section 13.6: Vegetation",1.0,"Land, Air, Vegetation, All",,26,107.0,1.41,Yes,kwn,http://www.cer-rec.gc.ca/esa-ees/kwn.zip
1,Figure 13.3-1 CEA Framework,Figure,Application to Construct and Operate Ekwan Pipeline,Ekwan,3/17/2003,EnCana Ekwan Pipeline Inc.,Gas,A0H8C0 - 13.0 EIA - Section 13.1 to 13.6,https://apps.cer-rec.gc.ca/REGDOCS/Item/LoadResult/268693,A0H8C0,268706,https://apps.cer-rec.gc.ca/REGDOCS/File/Download/268706,Large Projects (over 40 km),"Alberta, British Columbia, All",GH-1-2003,AXYS Environmental Consulting Ltd.,Operating,GC-108,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/268876,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/293763,"Section 13.1: Introduction, Section 13.2: Project Description, Section 13.3: Assessment Methods, Section 13.4: Air, Section 13.5: Terrain and Soils, Section 13.6: Vegetation",1.0,"Land, Air, Vegetation, All",,41,107.0,1.41,Yes,kwn,http://www.cer-rec.gc.ca/esa-ees/kwn.zip
2,Figure 13.4-1 Temperature Normals Measured at the Fort Nelson Airport for the Period 1971 to 2000,Figure,Application to Construct and Operate Ekwan Pipeline,Ekwan,3/17/2003,EnCana Ekwan Pipeline Inc.,Gas,A0H8C0 - 13.0 EIA - Section 13.1 to 13.6,https://apps.cer-rec.gc.ca/REGDOCS/Item/LoadResult/268693,A0H8C0,268706,https://apps.cer-rec.gc.ca/REGDOCS/File/Download/268706,Large Projects (over 40 km),"Alberta, British Columbia, All",GH-1-2003,AXYS Environmental Consulting Ltd.,Operating,GC-108,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/268876,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/293763,"Section 13.1: Introduction, Section 13.2: Project Description, Section 13.3: Assessment Methods, Section 13.4: Air, Section 13.5: Terrain and Soils, Section 13.6: Vegetation",1.0,"Land, Air, Vegetation, All",,44,107.0,1.41,Yes,kwn,http://www.cer-rec.gc.ca/esa-ees/kwn.zip
3,Figure 13.4-2 Mean Monthly Rainfall and Number of Days with Measurable Rainfall Observed at the Fort Nelson Airport for the Period 1971 to 2000,Figure,Application to Construct and Operate Ekwan Pipeline,Ekwan,3/17/2003,EnCana Ekwan Pipeline Inc.,Gas,A0H8C0 - 13.0 EIA - Section 13.1 to 13.6,https://apps.cer-rec.gc.ca/REGDOCS/Item/LoadResult/268693,A0H8C0,268706,https://apps.cer-rec.gc.ca/REGDOCS/File/Download/268706,Large Projects (over 40 km),"Alberta, British Columbia, All",GH-1-2003,AXYS Environmental Consulting Ltd.,Operating,GC-108,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/268876,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/293763,"Section 13.1: Introduction, Section 13.2: Project Description, Section 13.3: Assessment Methods, Section 13.4: Air, Section 13.5: Terrain and Soils, Section 13.6: Vegetation",1.0,"Land, Air, Vegetation, All",,45,107.0,1.41,Yes,kwn,http://www.cer-rec.gc.ca/esa-ees/kwn.zip
4,Figure 13.4-3 Mean Monthly Snowfall and Number of Days with Measurable Snowfall Observed at the Fort Nelson Airport for the Period 1971 to 2000,Figure,Application to Construct and Operate Ekwan Pipeline,Ekwan,3/17/2003,EnCana Ekwan Pipeline Inc.,Gas,A0H8C0 - 13.0 EIA - Section 13.1 to 13.6,https://apps.cer-rec.gc.ca/REGDOCS/Item/LoadResult/268693,A0H8C0,268706,https://apps.cer-rec.gc.ca/REGDOCS/File/Download/268706,Large Projects (over 40 km),"Alberta, British Columbia, All",GH-1-2003,AXYS Environmental Consulting Ltd.,Operating,GC-108,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/268876,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/293763,"Section 13.1: Introduction, Section 13.2: Project Description, Section 13.3: Assessment Methods, Section 13.4: Air, Section 13.5: Terrain and Soils, Section 13.6: Vegetation",1.0,"Land, Air, Vegetation, All",,46,107.0,1.41,Yes,kwn,http://www.cer-rec.gc.ca/esa-ees/kwn.zip


In [77]:
df_FRA.head()

Unnamed: 0,Titre,Type de contenu,Nom de la demande,Nom abrégé de la demande,Dépôt de la demande,Nom de la société,Produit de base,Nom de fichier,URL du dossier de lÉES,Numéro de document,Identificateur de données,URL de téléchargement PDF,Type de demande (Loi sur lOffice national de lénergie),Emplacement du pipeline,Ordonnance daudience,Nom du consultant,État d'avancement,Instruments réglementaires,URL de la demande,URL de la décision,Sections de lEES,Index des sections de lÉES,Sujets des sections de lÉES,URL de téléchargement CSV,Numéro de page PDF,Nombre de pages PDF,Taille PDF,Aperçu PDF,Télécharger le nom du dossier,Lien vers le projet compressé
0,Figure 13.1-1 Pipeline EnCana Ekwan,Figure,Demande visant la construction et lexploitation du pipeline Ekwan,Ekwan,3/17/2003,EnCana Ekwan Pipeline Inc.,Gaz,A0H8C0 - 13.0 EIA - Section 13.1 to 13.6,https://apps.cer-rec.gc.ca/REGDOCS/Item/LoadResult/268693,A0H8C0,268706,https://apps.cer-rec.gc.ca/REGDOCS/File/Download/268706,Projets de grande envergure (plus de 40 km),"Alberta, Colombie britannique",GH-1-2003,AXYS Environmental Consulting Ltd.,En exploitation,GC-108,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/268876,https://apps.cer-rec.gc.ca/REGDOCS/%C3%89l%C3%A9ment/Afficher/293763,"Section 13.1 : Introduction, Section 13.2 : Description du projet, Section 13.3 : Méthodes d'évaluation, Section 13.4 : Air, Section 13.5 : Terrain et sols, Section 13.6 : Végétation",1.0,"Terres, Air, Végétation",,26,107.0,1.41,Oui,kwn,http://www.cer-rec.gc.ca/esa-ees/kwn.zip
1,Figure 13.3-1 Cadre de travail de lACEE,Figure,Demande visant la construction et lexploitation du pipeline Ekwan,Ekwan,3/17/2003,EnCana Ekwan Pipeline Inc.,Gaz,A0H8C0 - 13.0 EIA - Section 13.1 to 13.6,https://apps.cer-rec.gc.ca/REGDOCS/Item/LoadResult/268693,A0H8C0,268706,https://apps.cer-rec.gc.ca/REGDOCS/File/Download/268706,Projets de grande envergure (plus de 40 km),"Alberta, Colombie britannique",GH-1-2003,AXYS Environmental Consulting Ltd.,En exploitation,GC-108,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/268876,https://apps.cer-rec.gc.ca/REGDOCS/%C3%89l%C3%A9ment/Afficher/293763,"Section 13.1 : Introduction, Section 13.2 : Description du projet, Section 13.3 : Méthodes d'évaluation, Section 13.4 : Air, Section 13.5 : Terrain et sols, Section 13.6 : Végétation",1.0,"Terres, Air, Végétation",,41,107.0,1.41,Oui,kwn,http://www.cer-rec.gc.ca/esa-ees/kwn.zip
2,Figure 13.4-1 Normales de température mesurées à laéroport de Fort Nelson pour la période de 1971 à 2000,Figure,Demande visant la construction et lexploitation du pipeline Ekwan,Ekwan,3/17/2003,EnCana Ekwan Pipeline Inc.,Gaz,A0H8C0 - 13.0 EIA - Section 13.1 to 13.6,https://apps.cer-rec.gc.ca/REGDOCS/Item/LoadResult/268693,A0H8C0,268706,https://apps.cer-rec.gc.ca/REGDOCS/File/Download/268706,Projets de grande envergure (plus de 40 km),"Alberta, Colombie britannique",GH-1-2003,AXYS Environmental Consulting Ltd.,En exploitation,GC-108,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/268876,https://apps.cer-rec.gc.ca/REGDOCS/%C3%89l%C3%A9ment/Afficher/293763,"Section 13.1 : Introduction, Section 13.2 : Description du projet, Section 13.3 : Méthodes d'évaluation, Section 13.4 : Air, Section 13.5 : Terrain et sols, Section 13.6 : Végétation",1.0,"Terres, Air, Végétation",,44,107.0,1.41,Oui,kwn,http://www.cer-rec.gc.ca/esa-ees/kwn.zip
3,Figure 13.4-2 Pluie mensuelle moyenne et nombre de jours de pluie mesurable observés à laéroport de Fort Nelson pour la période de 1971 à 2000,Figure,Demande visant la construction et lexploitation du pipeline Ekwan,Ekwan,3/17/2003,EnCana Ekwan Pipeline Inc.,Gaz,A0H8C0 - 13.0 EIA - Section 13.1 to 13.6,https://apps.cer-rec.gc.ca/REGDOCS/Item/LoadResult/268693,A0H8C0,268706,https://apps.cer-rec.gc.ca/REGDOCS/File/Download/268706,Projets de grande envergure (plus de 40 km),"Alberta, Colombie britannique",GH-1-2003,AXYS Environmental Consulting Ltd.,En exploitation,GC-108,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/268876,https://apps.cer-rec.gc.ca/REGDOCS/%C3%89l%C3%A9ment/Afficher/293763,"Section 13.1 : Introduction, Section 13.2 : Description du projet, Section 13.3 : Méthodes d'évaluation, Section 13.4 : Air, Section 13.5 : Terrain et sols, Section 13.6 : Végétation",1.0,"Terres, Air, Végétation",,45,107.0,1.41,Oui,kwn,http://www.cer-rec.gc.ca/esa-ees/kwn.zip
4,Figure 13.4-3 Chute de neige mensuelle moyenne et nombre de jours avec chute de neige mesurable observés à laéroport de Fort Nelson pour la période de 1971 à 2000,Figure,Demande visant la construction et lexploitation du pipeline Ekwan,Ekwan,3/17/2003,EnCana Ekwan Pipeline Inc.,Gaz,A0H8C0 - 13.0 EIA - Section 13.1 to 13.6,https://apps.cer-rec.gc.ca/REGDOCS/Item/LoadResult/268693,A0H8C0,268706,https://apps.cer-rec.gc.ca/REGDOCS/File/Download/268706,Projets de grande envergure (plus de 40 km),"Alberta, Colombie britannique",GH-1-2003,AXYS Environmental Consulting Ltd.,En exploitation,GC-108,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/268876,https://apps.cer-rec.gc.ca/REGDOCS/%C3%89l%C3%A9ment/Afficher/293763,"Section 13.1 : Introduction, Section 13.2 : Description du projet, Section 13.3 : Méthodes d'évaluation, Section 13.4 : Air, Section 13.5 : Terrain et sols, Section 13.6 : Végétation",1.0,"Terres, Air, Végétation",,46,107.0,1.41,Oui,kwn,http://www.cer-rec.gc.ca/esa-ees/kwn.zip


In [78]:
# Remove all rows for figures so that we are only moving tables
df = df[df['Content Type'] == 'Table']
df_FRA = df_FRA[df_FRA['Type de contenu'] == 'Tableau']
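
The boolean-mask filter can be tried on a toy frame. In this minimal sketch the rows are made up and only the `Content Type` column name comes from the index file. It also shows that the surviving rows keep their original labels, which is why the filtered index no longer starts at 0.

```python
import pandas as pd

# Toy index with made-up rows; only the 'Content Type' column name is from the notebook.
toy = pd.DataFrame({
    "Title": ["Figure 1 Map", "Table 1 Soils", "Table 2 Water"],
    "Content Type": ["Figure", "Table", "Table"],
})

# Keep only the rows whose content type is exactly 'Table'.
tables_only = toy[toy["Content Type"] == "Table"]
print(tables_only.index.tolist())
```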

In [79]:
df_bad_DataIDs = pd.read_csv("H:/esa-intermediate-code/data_id_to_be_excluded.csv")

In [80]:
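# Flatten the exclusion table to a 1-D array so that isin compares against scalar IDs
bad_data_ids = df_bad_DataIDs.values.ravel()
df['bad_csv'] = df['Data ID'].isin(bad_data_ids)
df_FRA['mauvais_csv'] = df_FRA['Identificateur de données'].isin(bad_data_ids)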
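
A small sketch of the membership test, assuming the exclusion file holds a single column of Data IDs. The IDs and the column name `data_id` are made up for illustration; the real list is read from `data_id_to_be_excluded.csv`.

```python
import pandas as pd

# Made-up IDs for illustration only.
index_df = pd.DataFrame({"Data ID": [101, 102, 103]})
bad_ids_df = pd.DataFrame({"data_id": [102]})  # hypothetical column name

# Flatten to a 1-D array of scalars before the membership test.
bad_ids = bad_ids_df.to_numpy().ravel()
index_df["bad_csv"] = index_df["Data ID"].isin(bad_ids)
print(index_df["bad_csv"].tolist())
```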

In [81]:
df.head()

Unnamed: 0,Title,Content Type,Application Name,Application Short Name,Application Filing Date,Company Name,Commodity,File Name,ESA Folder URL,Document Number,Data ID,PDF Download URL,Application Type (NEB Act),Pipeline Location,Hearing order,Consultant Name,Pipeline Status,Regulatory Instrument(s),Application URL,Decision URL,ESA Section(s),ESA Section(s) Index,ESA Section(s) Topics,CSV Download URL,PDF Page Number,PDF Page Count,PDF Size,PDF Outline,Download folder name,Zipped Project Link,bad_csv
9134,TABLE 3 SUMMARY OF AQUATICS FIELD WORK AND ABORIGINAL FIELD STUDY PARTICIPATION FOR THE PROJECT,Table,Application for North Montney Project,North Montney,11/8/2013,NOVA Gas Transmission Ltd.,Gas,B2-16 ESA_Appendix_G_Part1of4 (A3Q6H2),https://apps.cer-rec.gc.ca/REGDOCS/Item/LoadResult/1060040,A3Q6H2,1059614,https://apps.cer-rec.gc.ca/REGDOCS/File/Download/1059614,Large Projects (over 40 km),"British Columbia, All",GH-001-2014,"Stantec Consulting Ltd., TERA Environmental Consultants",Operating,GC-125,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/1060220,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/3890551,Appendix G: TERA Aquatics Summary Report,15.0,"Water, All",http://www.cer-rec.gc.ca/esa-ees/nrthmntn/1059614_14_1.csv,14,48.0,5.87,No,nrthmntn,http://www.cer-rec.gc.ca/esa-ees/nrthmntn.zip,False
9135,TABLE 4 SUMMARY OF WATERCOURSE CROSSINGS ALONG THE NORTH MONTNEY MAINLINE (AITKEN CREEK SECTION),Table,Application for North Montney Project,North Montney,11/8/2013,NOVA Gas Transmission Ltd.,Gas,B2-16 ESA_Appendix_G_Part1of4 (A3Q6H2),https://apps.cer-rec.gc.ca/REGDOCS/Item/LoadResult/1060040,A3Q6H2,1059614,https://apps.cer-rec.gc.ca/REGDOCS/File/Download/1059614,Large Projects (over 40 km),"British Columbia, All",GH-001-2014,"Stantec Consulting Ltd., TERA Environmental Consultants",Operating,GC-125,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/1060220,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/3890551,Appendix G: TERA Aquatics Summary Report,15.0,"Water, All",http://www.cer-rec.gc.ca/esa-ees/nrthmntn/1059614_17_1.csv,17,48.0,5.87,No,nrthmntn,http://www.cer-rec.gc.ca/esa-ees/nrthmntn.zip,False
9136,TABLE 4 SUMMARY OF WATERCOURSE CROSSINGS ALONG THE NORTH MONTNEY MAINLINE (AITKEN CREEK SECTION),Table,Application for North Montney Project,North Montney,11/8/2013,NOVA Gas Transmission Ltd.,Gas,B2-16 ESA_Appendix_G_Part1of4 (A3Q6H2),https://apps.cer-rec.gc.ca/REGDOCS/Item/LoadResult/1060040,A3Q6H2,1059614,https://apps.cer-rec.gc.ca/REGDOCS/File/Download/1059614,Large Projects (over 40 km),"British Columbia, All",GH-001-2014,"Stantec Consulting Ltd., TERA Environmental Consultants",Operating,GC-125,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/1060220,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/3890551,Appendix G: TERA Aquatics Summary Report,15.0,"Water, All",http://www.cer-rec.gc.ca/esa-ees/nrthmntn/1059614_18_1.csv,18,48.0,5.87,No,nrthmntn,http://www.cer-rec.gc.ca/esa-ees/nrthmntn.zip,False
9137,TABLE 4 SUMMARY OF WATERCOURSE CROSSINGS ALONG THE NORTH MONTNEY MAINLINE (AITKEN CREEK SECTION),Table,Application for North Montney Project,North Montney,11/8/2013,NOVA Gas Transmission Ltd.,Gas,B2-16 ESA_Appendix_G_Part1of4 (A3Q6H2),https://apps.cer-rec.gc.ca/REGDOCS/Item/LoadResult/1060040,A3Q6H2,1059614,https://apps.cer-rec.gc.ca/REGDOCS/File/Download/1059614,Large Projects (over 40 km),"British Columbia, All",GH-001-2014,"Stantec Consulting Ltd., TERA Environmental Consultants",Operating,GC-125,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/1060220,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/3890551,Appendix G: TERA Aquatics Summary Report,15.0,"Water, All",http://www.cer-rec.gc.ca/esa-ees/nrthmntn/1059614_19_1.csv,19,48.0,5.87,No,nrthmntn,http://www.cer-rec.gc.ca/esa-ees/nrthmntn.zip,False
9138,TABLE 4 SUMMARY OF WATERCOURSE CROSSINGS ALONG THE NORTH MONTNEY MAINLINE (AITKEN CREEK SECTION),Table,Application for North Montney Project,North Montney,11/8/2013,NOVA Gas Transmission Ltd.,Gas,B2-16 ESA_Appendix_G_Part1of4 (A3Q6H2),https://apps.cer-rec.gc.ca/REGDOCS/Item/LoadResult/1060040,A3Q6H2,1059614,https://apps.cer-rec.gc.ca/REGDOCS/File/Download/1059614,Large Projects (over 40 km),"British Columbia, All",GH-001-2014,"Stantec Consulting Ltd., TERA Environmental Consultants",Operating,GC-125,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/1060220,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/3890551,Appendix G: TERA Aquatics Summary Report,15.0,"Water, All",http://www.cer-rec.gc.ca/esa-ees/nrthmntn/1059614_20_1.csv,20,48.0,5.87,No,nrthmntn,http://www.cer-rec.gc.ca/esa-ees/nrthmntn.zip,False


In [82]:
# Reset the index; the original row labels are kept in a new 'Index' column
# df.drop(columns=['Unnamed: 0'], inplace=True)
df = df.reset_index()
df.rename(columns = {"index": "Index"}, inplace = True)

# Same for the French index; the original row labels are kept in 'Indice'
# df_FRA.drop(columns=['Unnamed: 0'], inplace=True)
df_FRA = df_FRA.reset_index()
df_FRA.rename(columns = {"index": "Indice"}, inplace = True)
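
To see what `reset_index` does here, the sketch below uses a toy frame whose labels mimic the gap left by filtering. The titles are made up. The old labels move into a column and the frame gets a fresh 0-based index.

```python
import pandas as pd

# Toy frame with non-contiguous labels, as left behind by a row filter.
toy = pd.DataFrame({"Title": ["Table A", "Table B"]}, index=[9134, 9135])

# reset_index moves the old labels into a column named 'index'.
toy = toy.reset_index().rename(columns={"index": "Index"})
print(toy.columns.tolist())
print(toy["Index"].tolist())
```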

In [83]:
df_FRA.head()

Unnamed: 0,Indice,Titre,Type de contenu,Nom de la demande,Nom abrégé de la demande,Dépôt de la demande,Nom de la société,Produit de base,Nom de fichier,URL du dossier de lÉES,Numéro de document,Identificateur de données,URL de téléchargement PDF,Type de demande (Loi sur lOffice national de lénergie),Emplacement du pipeline,Ordonnance daudience,Nom du consultant,État d'avancement,Instruments réglementaires,URL de la demande,URL de la décision,Sections de lEES,Index des sections de lÉES,Sujets des sections de lÉES,URL de téléchargement CSV,Numéro de page PDF,Nombre de pages PDF,Taille PDF,Aperçu PDF,Télécharger le nom du dossier,Lien vers le projet compressé,mauvais_csv
0,9134,TABLEAU 3 RÉSUMÉ DU TRAVAIL SUR LE TERRAIN AQUATIQUE ET DE LA PARTICIPATION DES AUTOCHTONES À LÉTUDE SUR LE TERRAIN POUR LE PROJET,Tableau,Demande visant le projet North Montney,North Montney,11/8/2013,NOVA Gas Transmission Ltd.,Gaz,B2-16 ESA_Appendix_G_Part1of4 (A3Q6H2),https://apps.cer-rec.gc.ca/REGDOCS/Item/LoadResult/1060040,A3Q6H2,1059614,https://apps.cer-rec.gc.ca/REGDOCS/File/Download/1059614,Projets de grande envergure (plus de 40 km),Colombie britannique,GH-001-2014,"Stantec Consulting Ltd., TERA Environmental Consultants",En exploitation,GC-125,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/1060220,https://apps.cer-rec.gc.ca/REGDOCS/%C3%89l%C3%A9ment/Afficher/3890551,Annexe G : Rapport sommaire de TERA sur le milieu aquatique,15.0,Eau,http://www.cer-rec.gc.ca/esa-ees/nrthmntn/1059614_14_1.csv,14,48.0,5.87,Non,nrthmntn,http://www.cer-rec.gc.ca/esa-ees/nrthmntn.zip,False
1,9135,TABLEAU 4 RÉSUMÉ DES FRANCHISSEMENTS DE COURS DEAU DE LA CANALISATION PRINCIPALE NORTH MONTNEY (TRONÇON AITKEN CREEK),Tableau,Demande visant le projet North Montney,North Montney,11/8/2013,NOVA Gas Transmission Ltd.,Gaz,B2-16 ESA_Appendix_G_Part1of4 (A3Q6H2),https://apps.cer-rec.gc.ca/REGDOCS/Item/LoadResult/1060040,A3Q6H2,1059614,https://apps.cer-rec.gc.ca/REGDOCS/File/Download/1059614,Projets de grande envergure (plus de 40 km),Colombie britannique,GH-001-2014,"Stantec Consulting Ltd., TERA Environmental Consultants",En exploitation,GC-125,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/1060220,https://apps.cer-rec.gc.ca/REGDOCS/%C3%89l%C3%A9ment/Afficher/3890551,Annexe G : Rapport sommaire de TERA sur le milieu aquatique,15.0,Eau,http://www.cer-rec.gc.ca/esa-ees/nrthmntn/1059614_17_1.csv,17,48.0,5.87,Non,nrthmntn,http://www.cer-rec.gc.ca/esa-ees/nrthmntn.zip,False
2,9136,TABLEAU 4 RÉSUMÉ DES FRANCHISSEMENTS DE COURS DEAU DE LA CANALISATION PRINCIPALE NORTH MONTNEY (TRONÇON AITKEN CREEK),Tableau,Demande visant le projet North Montney,North Montney,11/8/2013,NOVA Gas Transmission Ltd.,Gaz,B2-16 ESA_Appendix_G_Part1of4 (A3Q6H2),https://apps.cer-rec.gc.ca/REGDOCS/Item/LoadResult/1060040,A3Q6H2,1059614,https://apps.cer-rec.gc.ca/REGDOCS/File/Download/1059614,Projets de grande envergure (plus de 40 km),Colombie britannique,GH-001-2014,"Stantec Consulting Ltd., TERA Environmental Consultants",En exploitation,GC-125,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/1060220,https://apps.cer-rec.gc.ca/REGDOCS/%C3%89l%C3%A9ment/Afficher/3890551,Annexe G : Rapport sommaire de TERA sur le milieu aquatique,15.0,Eau,http://www.cer-rec.gc.ca/esa-ees/nrthmntn/1059614_18_1.csv,18,48.0,5.87,Non,nrthmntn,http://www.cer-rec.gc.ca/esa-ees/nrthmntn.zip,False
3,9137,TABLEAU 4 RÉSUMÉ DES FRANCHISSEMENTS DE COURS DEAU DE LA CANALISATION PRINCIPALE NORTH MONTNEY (TRONÇON AITKEN CREEK),Tableau,Demande visant le projet North Montney,North Montney,11/8/2013,NOVA Gas Transmission Ltd.,Gaz,B2-16 ESA_Appendix_G_Part1of4 (A3Q6H2),https://apps.cer-rec.gc.ca/REGDOCS/Item/LoadResult/1060040,A3Q6H2,1059614,https://apps.cer-rec.gc.ca/REGDOCS/File/Download/1059614,Projets de grande envergure (plus de 40 km),Colombie britannique,GH-001-2014,"Stantec Consulting Ltd., TERA Environmental Consultants",En exploitation,GC-125,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/1060220,https://apps.cer-rec.gc.ca/REGDOCS/%C3%89l%C3%A9ment/Afficher/3890551,Annexe G : Rapport sommaire de TERA sur le milieu aquatique,15.0,Eau,http://www.cer-rec.gc.ca/esa-ees/nrthmntn/1059614_19_1.csv,19,48.0,5.87,Non,nrthmntn,http://www.cer-rec.gc.ca/esa-ees/nrthmntn.zip,False
4,9138,TABLEAU 4 RÉSUMÉ DES FRANCHISSEMENTS DE COURS DEAU DE LA CANALISATION PRINCIPALE NORTH MONTNEY (TRONÇON AITKEN CREEK),Tableau,Demande visant le projet North Montney,North Montney,11/8/2013,NOVA Gas Transmission Ltd.,Gaz,B2-16 ESA_Appendix_G_Part1of4 (A3Q6H2),https://apps.cer-rec.gc.ca/REGDOCS/Item/LoadResult/1060040,A3Q6H2,1059614,https://apps.cer-rec.gc.ca/REGDOCS/File/Download/1059614,Projets de grande envergure (plus de 40 km),Colombie britannique,GH-001-2014,"Stantec Consulting Ltd., TERA Environmental Consultants",En exploitation,GC-125,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/1060220,https://apps.cer-rec.gc.ca/REGDOCS/%C3%89l%C3%A9ment/Afficher/3890551,Annexe G : Rapport sommaire de TERA sur le milieu aquatique,15.0,Eau,http://www.cer-rec.gc.ca/esa-ees/nrthmntn/1059614_20_1.csv,20,48.0,5.87,Non,nrthmntn,http://www.cer-rec.gc.ca/esa-ees/nrthmntn.zip,False


In [84]:
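# Creating the names of each csv file: lowercase the title, drop parentheses,
# turn spaces and periods into hyphens, strip anything that is not a word
# character, '+' or '-', and keep the first 50 characters.
# regex is passed explicitly because the default of str.replace differs between pandas versions.
df['filename'] = df['Download folder name'] + '_' + df['Title'].str.lower().str.replace('(', '', regex=False).str.replace(')', '', regex=False).str.replace(' ', '-', regex=False).str.replace('.', '-', regex=False).str.replace(r'[^\w+-]', '', regex=True).str.slice(0,50)

df_FRA['nom_du_fichier'] = df_FRA['Télécharger le nom du dossier'] + '_' + df_FRA['Titre'].str.lower().str.replace('(', '', regex=False).str.replace(')', '', regex=False).str.replace(' ', '-', regex=False).str.replace('.', '-', regex=False).str.replace(r'[^\w+-]', '', regex=True).str.slice(0,50)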
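
The title clean-up can be checked in isolation. This sketch applies the same chain of string operations to two titles taken from the index file, with the `regex` argument spelled out on every call. The variable names `titles` and `slugs` are illustrative and not part of the notebook.

```python
import pandas as pd

titles = pd.Series([
    "TABLE 3 SUMMARY OF AQUATICS FIELD WORK AND ABORIGINAL FIELD STUDY PARTICIPATION FOR THE PROJECT",
    "TABLE 4 SUMMARY OF WATERCOURSE CROSSINGS ALONG THE NORTH MONTNEY MAINLINE (AITKEN CREEK SECTION)",
])

slugs = (
    titles.str.lower()
    .str.replace("(", "", regex=False)        # literal parentheses
    .str.replace(")", "", regex=False)
    .str.replace(" ", "-", regex=False)       # spaces become hyphens
    .str.replace(".", "-", regex=False)       # literal periods become hyphens
    .str.replace(r"[^\w+-]", "", regex=True)  # drop everything else except word chars, '+', '-'
    .str.slice(0, 50)                         # truncate to 50 characters
)
print(slugs.tolist())
```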

In [85]:
df.head()

Unnamed: 0,Index,Title,Content Type,Application Name,Application Short Name,Application Filing Date,Company Name,Commodity,File Name,ESA Folder URL,Document Number,Data ID,PDF Download URL,Application Type (NEB Act),Pipeline Location,Hearing order,Consultant Name,Pipeline Status,Regulatory Instrument(s),Application URL,Decision URL,ESA Section(s),ESA Section(s) Index,ESA Section(s) Topics,CSV Download URL,PDF Page Number,PDF Page Count,PDF Size,PDF Outline,Download folder name,Zipped Project Link,bad_csv,filename
0,9134,TABLE 3 SUMMARY OF AQUATICS FIELD WORK AND ABORIGINAL FIELD STUDY PARTICIPATION FOR THE PROJECT,Table,Application for North Montney Project,North Montney,11/8/2013,NOVA Gas Transmission Ltd.,Gas,B2-16 ESA_Appendix_G_Part1of4 (A3Q6H2),https://apps.cer-rec.gc.ca/REGDOCS/Item/LoadResult/1060040,A3Q6H2,1059614,https://apps.cer-rec.gc.ca/REGDOCS/File/Download/1059614,Large Projects (over 40 km),"British Columbia, All",GH-001-2014,"Stantec Consulting Ltd., TERA Environmental Consultants",Operating,GC-125,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/1060220,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/3890551,Appendix G: TERA Aquatics Summary Report,15.0,"Water, All",http://www.cer-rec.gc.ca/esa-ees/nrthmntn/1059614_14_1.csv,14,48.0,5.87,No,nrthmntn,http://www.cer-rec.gc.ca/esa-ees/nrthmntn.zip,False,nrthmntn_table-3-summary-of-aquatics-field-work-and-aborigi
1,9135,TABLE 4 SUMMARY OF WATERCOURSE CROSSINGS ALONG THE NORTH MONTNEY MAINLINE (AITKEN CREEK SECTION),Table,Application for North Montney Project,North Montney,11/8/2013,NOVA Gas Transmission Ltd.,Gas,B2-16 ESA_Appendix_G_Part1of4 (A3Q6H2),https://apps.cer-rec.gc.ca/REGDOCS/Item/LoadResult/1060040,A3Q6H2,1059614,https://apps.cer-rec.gc.ca/REGDOCS/File/Download/1059614,Large Projects (over 40 km),"British Columbia, All",GH-001-2014,"Stantec Consulting Ltd., TERA Environmental Consultants",Operating,GC-125,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/1060220,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/3890551,Appendix G: TERA Aquatics Summary Report,15.0,"Water, All",http://www.cer-rec.gc.ca/esa-ees/nrthmntn/1059614_17_1.csv,17,48.0,5.87,No,nrthmntn,http://www.cer-rec.gc.ca/esa-ees/nrthmntn.zip,False,nrthmntn_table-4-summary-of-watercourse-crossings-along-the
2,9136,TABLE 4 SUMMARY OF WATERCOURSE CROSSINGS ALONG THE NORTH MONTNEY MAINLINE (AITKEN CREEK SECTION),Table,Application for North Montney Project,North Montney,11/8/2013,NOVA Gas Transmission Ltd.,Gas,B2-16 ESA_Appendix_G_Part1of4 (A3Q6H2),https://apps.cer-rec.gc.ca/REGDOCS/Item/LoadResult/1060040,A3Q6H2,1059614,https://apps.cer-rec.gc.ca/REGDOCS/File/Download/1059614,Large Projects (over 40 km),"British Columbia, All",GH-001-2014,"Stantec Consulting Ltd., TERA Environmental Consultants",Operating,GC-125,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/1060220,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/3890551,Appendix G: TERA Aquatics Summary Report,15.0,"Water, All",http://www.cer-rec.gc.ca/esa-ees/nrthmntn/1059614_18_1.csv,18,48.0,5.87,No,nrthmntn,http://www.cer-rec.gc.ca/esa-ees/nrthmntn.zip,False,nrthmntn_table-4-summary-of-watercourse-crossings-along-the
3,9137,TABLE 4 SUMMARY OF WATERCOURSE CROSSINGS ALONG THE NORTH MONTNEY MAINLINE (AITKEN CREEK SECTION),Table,Application for North Montney Project,North Montney,11/8/2013,NOVA Gas Transmission Ltd.,Gas,B2-16 ESA_Appendix_G_Part1of4 (A3Q6H2),https://apps.cer-rec.gc.ca/REGDOCS/Item/LoadResult/1060040,A3Q6H2,1059614,https://apps.cer-rec.gc.ca/REGDOCS/File/Download/1059614,Large Projects (over 40 km),"British Columbia, All",GH-001-2014,"Stantec Consulting Ltd., TERA Environmental Consultants",Operating,GC-125,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/1060220,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/3890551,Appendix G: TERA Aquatics Summary Report,15.0,"Water, All",http://www.cer-rec.gc.ca/esa-ees/nrthmntn/1059614_19_1.csv,19,48.0,5.87,No,nrthmntn,http://www.cer-rec.gc.ca/esa-ees/nrthmntn.zip,False,nrthmntn_table-4-summary-of-watercourse-crossings-along-the
4,9138,TABLE 4 SUMMARY OF WATERCOURSE CROSSINGS ALONG THE NORTH MONTNEY MAINLINE (AITKEN CREEK SECTION),Table,Application for North Montney Project,North Montney,11/8/2013,NOVA Gas Transmission Ltd.,Gas,B2-16 ESA_Appendix_G_Part1of4 (A3Q6H2),https://apps.cer-rec.gc.ca/REGDOCS/Item/LoadResult/1060040,A3Q6H2,1059614,https://apps.cer-rec.gc.ca/REGDOCS/File/Download/1059614,Large Projects (over 40 km),"British Columbia, All",GH-001-2014,"Stantec Consulting Ltd., TERA Environmental Consultants",Operating,GC-125,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/1060220,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/3890551,Appendix G: TERA Aquatics Summary Report,15.0,"Water, All",http://www.cer-rec.gc.ca/esa-ees/nrthmntn/1059614_20_1.csv,20,48.0,5.87,No,nrthmntn,http://www.cer-rec.gc.ca/esa-ees/nrthmntn.zip,False,nrthmntn_table-4-summary-of-watercourse-crossings-along-the


In [86]:
# Creating a column with the old filename so that we can rename the files
old_filename_df = df['CSV Download URL'].str.split('/').str[-1].str.split('_')
df['old_filename'] = old_filename_df.str[0] + '_' + old_filename_df.str[1] + '_lattice-v_' + old_filename_df.str[2]

vieux_nom_du_fichier_df = df_FRA['URL de téléchargement CSV'].str.split('/').str[-1].str.split('_')
df_FRA['vieux_nom_de_fichier'] = vieux_nom_du_fichier_df.str[0] + '_' + vieux_nom_du_fichier_df.str[1] + '_lattice-v_' + vieux_nom_du_fichier_df.str[2]
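
The same reconstruction for a single URL, written in plain Python. The helper `old_name_from_url` is hypothetical; it assumes, as the cell above does, that the last path segment has at least three underscore-separated pieces.

```python
def old_name_from_url(url):
    """Rebuild the pre-rename filename from a CSV download URL (hypothetical helper)."""
    parts = url.split("/")[-1].split("_")
    # Insert the 'lattice-v' marker between the page number and the table counter.
    return parts[0] + "_" + parts[1] + "_lattice-v_" + parts[2]


url = "http://www.cer-rec.gc.ca/esa-ees/nrthmntn/1059614_14_1.csv"
old_name = old_name_from_url(url)
print(old_name)
```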

In [87]:
df['old_filename']

0        1059614_14_lattice-v_1.csv
1        1059614_17_lattice-v_1.csv
2        1059614_18_lattice-v_1.csv
3        1059614_19_lattice-v_1.csv
4        1059614_20_lattice-v_1.csv
                    ...            
28886     895339_10_lattice-v_1.csv
28887     895339_12_lattice-v_1.csv
28888     895339_13_lattice-v_1.csv
28889     895339_14_lattice-v_1.csv
28890     895339_23_lattice-v_1.csv
Name: old_filename, Length: 28891, dtype: object

In [88]:
%%time
# We add a counter for all CSVs connected to the same table
# For the English index file
prev_title = ''
i = 0
for index, row in df.iterrows():
    base_title = row['filename']
    # Consecutive rows with the same base name are parts of the same table
    i = i + 1 if base_title == prev_title else 1
    current_title = base_title + '_pt-' + str(i) + '_pg-' + str(row['PDF Page Number']) + '_doc-num-' + str(row['Document Number']) + '.csv'

    df.loc[index, 'filename'] = current_title
    # Build the URL by string concatenation; os.path.join is meant for filesystem paths
    df.loc[index, 'CSV Download URL'] = 'http://www.cer-rec.gc.ca/esa-ees/' + row['Download folder name'] + '/' + current_title
    prev_title = base_title

Wall time: 1min 34s
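
A row-by-row loop is easy to follow but slow on an index of this size. As a possible alternative, not the method used in this notebook, the part counter can be computed without a loop by labelling consecutive runs of identical base names and numbering the rows inside each run. The sketch below uses made-up base names, and `run_id` and `part` are illustrative names.

```python
import pandas as pd

# Toy data: base names before the part counter is appended (made-up values).
toy = pd.DataFrame({"filename": ["a", "b", "b", "b", "a"]})

# A new run starts whenever the base name differs from the previous row.
run_id = (toy["filename"] != toy["filename"].shift()).cumsum()

# Number the rows inside each run, starting at 1.
toy["part"] = toy.groupby(run_id).cumcount() + 1
print(toy["part"].tolist())
```

Like the loop, this restarts the counter when the same base name reappears after a different one, so it only groups adjacent rows.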


In [89]:
df.head(5)

Unnamed: 0,Index,Title,Content Type,Application Name,Application Short Name,Application Filing Date,Company Name,Commodity,File Name,ESA Folder URL,Document Number,Data ID,PDF Download URL,Application Type (NEB Act),Pipeline Location,Hearing order,Consultant Name,Pipeline Status,Regulatory Instrument(s),Application URL,Decision URL,ESA Section(s),ESA Section(s) Index,ESA Section(s) Topics,CSV Download URL,PDF Page Number,PDF Page Count,PDF Size,PDF Outline,Download folder name,Zipped Project Link,bad_csv,filename,old_filename
0,9134,TABLE 3 SUMMARY OF AQUATICS FIELD WORK AND ABORIGINAL FIELD STUDY PARTICIPATION FOR THE PROJECT,Table,Application for North Montney Project,North Montney,11/8/2013,NOVA Gas Transmission Ltd.,Gas,B2-16 ESA_Appendix_G_Part1of4 (A3Q6H2),https://apps.cer-rec.gc.ca/REGDOCS/Item/LoadResult/1060040,A3Q6H2,1059614,https://apps.cer-rec.gc.ca/REGDOCS/File/Download/1059614,Large Projects (over 40 km),"British Columbia, All",GH-001-2014,"Stantec Consulting Ltd., TERA Environmental Consultants",Operating,GC-125,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/1060220,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/3890551,Appendix G: TERA Aquatics Summary Report,15.0,"Water, All",http://www.cer-rec.gc.ca/esa-ees/nrthmntn/nrthmntn_table-3-summary-of-aquatics-field-work-and-aborigi_pt-1_pg-14_doc-num-A3Q6H2.csv,14,48.0,5.87,No,nrthmntn,http://www.cer-rec.gc.ca/esa-ees/nrthmntn.zip,False,nrthmntn_table-3-summary-of-aquatics-field-work-and-aborigi_pt-1_pg-14_doc-num-A3Q6H2.csv,1059614_14_lattice-v_1.csv
1,9135,TABLE 4 SUMMARY OF WATERCOURSE CROSSINGS ALONG THE NORTH MONTNEY MAINLINE (AITKEN CREEK SECTION),Table,Application for North Montney Project,North Montney,11/8/2013,NOVA Gas Transmission Ltd.,Gas,B2-16 ESA_Appendix_G_Part1of4 (A3Q6H2),https://apps.cer-rec.gc.ca/REGDOCS/Item/LoadResult/1060040,A3Q6H2,1059614,https://apps.cer-rec.gc.ca/REGDOCS/File/Download/1059614,Large Projects (over 40 km),"British Columbia, All",GH-001-2014,"Stantec Consulting Ltd., TERA Environmental Consultants",Operating,GC-125,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/1060220,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/3890551,Appendix G: TERA Aquatics Summary Report,15.0,"Water, All",http://www.cer-rec.gc.ca/esa-ees/nrthmntn/nrthmntn_table-4-summary-of-watercourse-crossings-along-the_pt-1_pg-17_doc-num-A3Q6H2.csv,17,48.0,5.87,No,nrthmntn,http://www.cer-rec.gc.ca/esa-ees/nrthmntn.zip,False,nrthmntn_table-4-summary-of-watercourse-crossings-along-the_pt-1_pg-17_doc-num-A3Q6H2.csv,1059614_17_lattice-v_1.csv
2,9136,TABLE 4 SUMMARY OF WATERCOURSE CROSSINGS ALONG THE NORTH MONTNEY MAINLINE (AITKEN CREEK SECTION),Table,Application for North Montney Project,North Montney,11/8/2013,NOVA Gas Transmission Ltd.,Gas,B2-16 ESA_Appendix_G_Part1of4 (A3Q6H2),https://apps.cer-rec.gc.ca/REGDOCS/Item/LoadResult/1060040,A3Q6H2,1059614,https://apps.cer-rec.gc.ca/REGDOCS/File/Download/1059614,Large Projects (over 40 km),"British Columbia, All",GH-001-2014,"Stantec Consulting Ltd., TERA Environmental Consultants",Operating,GC-125,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/1060220,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/3890551,Appendix G: TERA Aquatics Summary Report,15.0,"Water, All",http://www.cer-rec.gc.ca/esa-ees/nrthmntn/nrthmntn_table-4-summary-of-watercourse-crossings-along-the_pt-2_pg-18_doc-num-A3Q6H2.csv,18,48.0,5.87,No,nrthmntn,http://www.cer-rec.gc.ca/esa-ees/nrthmntn.zip,False,nrthmntn_table-4-summary-of-watercourse-crossings-along-the_pt-2_pg-18_doc-num-A3Q6H2.csv,1059614_18_lattice-v_1.csv
3,9137,TABLE 4 SUMMARY OF WATERCOURSE CROSSINGS ALONG THE NORTH MONTNEY MAINLINE (AITKEN CREEK SECTION),Table,Application for North Montney Project,North Montney,11/8/2013,NOVA Gas Transmission Ltd.,Gas,B2-16 ESA_Appendix_G_Part1of4 (A3Q6H2),https://apps.cer-rec.gc.ca/REGDOCS/Item/LoadResult/1060040,A3Q6H2,1059614,https://apps.cer-rec.gc.ca/REGDOCS/File/Download/1059614,Large Projects (over 40 km),"British Columbia, All",GH-001-2014,"Stantec Consulting Ltd., TERA Environmental Consultants",Operating,GC-125,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/1060220,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/3890551,Appendix G: TERA Aquatics Summary Report,15.0,"Water, All",http://www.cer-rec.gc.ca/esa-ees/nrthmntn/nrthmntn_table-4-summary-of-watercourse-crossings-along-the_pt-3_pg-19_doc-num-A3Q6H2.csv,19,48.0,5.87,No,nrthmntn,http://www.cer-rec.gc.ca/esa-ees/nrthmntn.zip,False,nrthmntn_table-4-summary-of-watercourse-crossings-along-the_pt-3_pg-19_doc-num-A3Q6H2.csv,1059614_19_lattice-v_1.csv
4,9138,TABLE 4 SUMMARY OF WATERCOURSE CROSSINGS ALONG THE NORTH MONTNEY MAINLINE (AITKEN CREEK SECTION),Table,Application for North Montney Project,North Montney,11/8/2013,NOVA Gas Transmission Ltd.,Gas,B2-16 ESA_Appendix_G_Part1of4 (A3Q6H2),https://apps.cer-rec.gc.ca/REGDOCS/Item/LoadResult/1060040,A3Q6H2,1059614,https://apps.cer-rec.gc.ca/REGDOCS/File/Download/1059614,Large Projects (over 40 km),"British Columbia, All",GH-001-2014,"Stantec Consulting Ltd., TERA Environmental Consultants",Operating,GC-125,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/1060220,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/3890551,Appendix G: TERA Aquatics Summary Report,15.0,"Water, All",http://www.cer-rec.gc.ca/esa-ees/nrthmntn/nrthmntn_table-4-summary-of-watercourse-crossings-along-the_pt-4_pg-20_doc-num-A3Q6H2.csv,20,48.0,5.87,No,nrthmntn,http://www.cer-rec.gc.ca/esa-ees/nrthmntn.zip,False,nrthmntn_table-4-summary-of-watercourse-crossings-along-the_pt-4_pg-20_doc-num-A3Q6H2.csv,1059614_20_lattice-v_1.csv


In [90]:
%%time
# For French index file
# Consecutive rows that share a base title are parts of the same table,
# so the part counter increments on a repeat and resets on a new title.
prev_title = ''
part = 1
for index, row in df_FRA.iterrows():
    base_title = row['nom_du_fichier']
    part = part + 1 if base_title == prev_title else 1
    new_name = f"{base_title}_pt-{part}_pg-{row['Numéro de page PDF']}_num-du-doc-{row['Numéro de document']}.csv"

    df_FRA.loc[index, 'nom_du_fichier'] = new_name
    # Build the URL by plain concatenation; os.path.join is meant for filesystem paths
    df_FRA.loc[index, 'URL de téléchargement CSV'] = 'http://www.cer-rec.gc.ca/esa-ees/' + row['Télécharger le nom du dossier'] + '/' + new_name
    prev_title = base_title

Wall time: 1min 34s
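
The loop above takes about a minute and a half because it writes to the DataFrame one row at a time. As a rough sketch of a vectorized alternative, the part counter can be derived by labelling each run of consecutive identical titles and counting within the run. The helper name `add_part_numbers` and the small `demo` frame are assumptions made for illustration; only the column names come from the French index file.

```python
import pandas as pd


def add_part_numbers(titles: pd.Series) -> pd.Series:
    """Number consecutive repeats of the same title 1, 2, 3, ..."""
    # A new run starts wherever the title differs from the previous row.
    run_id = (titles != titles.shift()).cumsum()
    # cumcount() is 0-based within each run, so add 1.
    return titles.groupby(run_id).cumcount() + 1


# Hypothetical stand-in for df_FRA with only the columns the naming needs
demo = pd.DataFrame({
    'nom_du_fichier': ['a', 'b', 'b', 'b', 'a'],
    'Numéro de page PDF': [14, 17, 18, 19, 25],
    'Numéro de document': ['A3Q6H2'] * 5,
})

parts = add_part_numbers(demo['nom_du_fichier'])
demo['nom_du_fichier'] = (
    demo['nom_du_fichier']
    + '_pt-' + parts.astype(str)
    + '_pg-' + demo['Numéro de page PDF'].astype(str)
    + '_num-du-doc-' + demo['Numéro de document'].astype(str)
    + '.csv'
)
print(demo['nom_du_fichier'].tolist())
```

Like the loop, this counts runs of adjacent rows, so a title that reappears later in the index starts again at part 1.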


In [91]:
df_FRA.head(5)

Unnamed: 0,Indice,Titre,Type de contenu,Nom de la demande,Nom abrégé de la demande,Dépôt de la demande,Nom de la société,Produit de base,Nom de fichier,URL du dossier de lÉES,Numéro de document,Identificateur de données,URL de téléchargement PDF,Type de demande (Loi sur lOffice national de lénergie),Emplacement du pipeline,Ordonnance daudience,Nom du consultant,État d'avancement,Instruments réglementaires,URL de la demande,URL de la décision,Sections de lEES,Index des sections de lÉES,Sujets des sections de lÉES,URL de téléchargement CSV,Numéro de page PDF,Nombre de pages PDF,Taille PDF,Aperçu PDF,Télécharger le nom du dossier,Lien vers le projet compressé,mauvais_csv,nom_du_fichier,vieux_nom_de_fichier
0,9134,TABLEAU 3 RÉSUMÉ DU TRAVAIL SUR LE TERRAIN AQUATIQUE ET DE LA PARTICIPATION DES AUTOCHTONES À LÉTUDE SUR LE TERRAIN POUR LE PROJET,Tableau,Demande visant le projet North Montney,North Montney,11/8/2013,NOVA Gas Transmission Ltd.,Gaz,B2-16 ESA_Appendix_G_Part1of4 (A3Q6H2),https://apps.cer-rec.gc.ca/REGDOCS/Item/LoadResult/1060040,A3Q6H2,1059614,https://apps.cer-rec.gc.ca/REGDOCS/File/Download/1059614,Projets de grande envergure (plus de 40 km),Colombie britannique,GH-001-2014,"Stantec Consulting Ltd., TERA Environmental Consultants",En exploitation,GC-125,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/1060220,https://apps.cer-rec.gc.ca/REGDOCS/%C3%89l%C3%A9ment/Afficher/3890551,Annexe G : Rapport sommaire de TERA sur le milieu aquatique,15.0,Eau,http://www.cer-rec.gc.ca/esa-ees/nrthmntn/nrthmntn_tableau-3-résumé-du-travail-sur-le-terrain-aquatiq_pt-1_pg-14_num-du-doc-A3Q6H2.csv,14,48.0,5.87,Non,nrthmntn,http://www.cer-rec.gc.ca/esa-ees/nrthmntn.zip,False,nrthmntn_tableau-3-résumé-du-travail-sur-le-terrain-aquatiq_pt-1_pg-14_num-du-doc-A3Q6H2.csv,1059614_14_lattice-v_1.csv
1,9135,TABLEAU 4 RÉSUMÉ DES FRANCHISSEMENTS DE COURS DEAU DE LA CANALISATION PRINCIPALE NORTH MONTNEY (TRONÇON AITKEN CREEK),Tableau,Demande visant le projet North Montney,North Montney,11/8/2013,NOVA Gas Transmission Ltd.,Gaz,B2-16 ESA_Appendix_G_Part1of4 (A3Q6H2),https://apps.cer-rec.gc.ca/REGDOCS/Item/LoadResult/1060040,A3Q6H2,1059614,https://apps.cer-rec.gc.ca/REGDOCS/File/Download/1059614,Projets de grande envergure (plus de 40 km),Colombie britannique,GH-001-2014,"Stantec Consulting Ltd., TERA Environmental Consultants",En exploitation,GC-125,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/1060220,https://apps.cer-rec.gc.ca/REGDOCS/%C3%89l%C3%A9ment/Afficher/3890551,Annexe G : Rapport sommaire de TERA sur le milieu aquatique,15.0,Eau,http://www.cer-rec.gc.ca/esa-ees/nrthmntn/nrthmntn_tableau-4-résumé-des-franchissements-de-cours-deau_pt-1_pg-17_num-du-doc-A3Q6H2.csv,17,48.0,5.87,Non,nrthmntn,http://www.cer-rec.gc.ca/esa-ees/nrthmntn.zip,False,nrthmntn_tableau-4-résumé-des-franchissements-de-cours-deau_pt-1_pg-17_num-du-doc-A3Q6H2.csv,1059614_17_lattice-v_1.csv
2,9136,TABLEAU 4 RÉSUMÉ DES FRANCHISSEMENTS DE COURS DEAU DE LA CANALISATION PRINCIPALE NORTH MONTNEY (TRONÇON AITKEN CREEK),Tableau,Demande visant le projet North Montney,North Montney,11/8/2013,NOVA Gas Transmission Ltd.,Gaz,B2-16 ESA_Appendix_G_Part1of4 (A3Q6H2),https://apps.cer-rec.gc.ca/REGDOCS/Item/LoadResult/1060040,A3Q6H2,1059614,https://apps.cer-rec.gc.ca/REGDOCS/File/Download/1059614,Projets de grande envergure (plus de 40 km),Colombie britannique,GH-001-2014,"Stantec Consulting Ltd., TERA Environmental Consultants",En exploitation,GC-125,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/1060220,https://apps.cer-rec.gc.ca/REGDOCS/%C3%89l%C3%A9ment/Afficher/3890551,Annexe G : Rapport sommaire de TERA sur le milieu aquatique,15.0,Eau,http://www.cer-rec.gc.ca/esa-ees/nrthmntn/nrthmntn_tableau-4-résumé-des-franchissements-de-cours-deau_pt-2_pg-18_num-du-doc-A3Q6H2.csv,18,48.0,5.87,Non,nrthmntn,http://www.cer-rec.gc.ca/esa-ees/nrthmntn.zip,False,nrthmntn_tableau-4-résumé-des-franchissements-de-cours-deau_pt-2_pg-18_num-du-doc-A3Q6H2.csv,1059614_18_lattice-v_1.csv
3,9137,TABLEAU 4 RÉSUMÉ DES FRANCHISSEMENTS DE COURS DEAU DE LA CANALISATION PRINCIPALE NORTH MONTNEY (TRONÇON AITKEN CREEK),Tableau,Demande visant le projet North Montney,North Montney,11/8/2013,NOVA Gas Transmission Ltd.,Gaz,B2-16 ESA_Appendix_G_Part1of4 (A3Q6H2),https://apps.cer-rec.gc.ca/REGDOCS/Item/LoadResult/1060040,A3Q6H2,1059614,https://apps.cer-rec.gc.ca/REGDOCS/File/Download/1059614,Projets de grande envergure (plus de 40 km),Colombie britannique,GH-001-2014,"Stantec Consulting Ltd., TERA Environmental Consultants",En exploitation,GC-125,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/1060220,https://apps.cer-rec.gc.ca/REGDOCS/%C3%89l%C3%A9ment/Afficher/3890551,Annexe G : Rapport sommaire de TERA sur le milieu aquatique,15.0,Eau,http://www.cer-rec.gc.ca/esa-ees/nrthmntn/nrthmntn_tableau-4-résumé-des-franchissements-de-cours-deau_pt-3_pg-19_num-du-doc-A3Q6H2.csv,19,48.0,5.87,Non,nrthmntn,http://www.cer-rec.gc.ca/esa-ees/nrthmntn.zip,False,nrthmntn_tableau-4-résumé-des-franchissements-de-cours-deau_pt-3_pg-19_num-du-doc-A3Q6H2.csv,1059614_19_lattice-v_1.csv
4,9138,TABLEAU 4 RÉSUMÉ DES FRANCHISSEMENTS DE COURS DEAU DE LA CANALISATION PRINCIPALE NORTH MONTNEY (TRONÇON AITKEN CREEK),Tableau,Demande visant le projet North Montney,North Montney,11/8/2013,NOVA Gas Transmission Ltd.,Gaz,B2-16 ESA_Appendix_G_Part1of4 (A3Q6H2),https://apps.cer-rec.gc.ca/REGDOCS/Item/LoadResult/1060040,A3Q6H2,1059614,https://apps.cer-rec.gc.ca/REGDOCS/File/Download/1059614,Projets de grande envergure (plus de 40 km),Colombie britannique,GH-001-2014,"Stantec Consulting Ltd., TERA Environmental Consultants",En exploitation,GC-125,https://apps.cer-rec.gc.ca/REGDOCS/Item/View/1060220,https://apps.cer-rec.gc.ca/REGDOCS/%C3%89l%C3%A9ment/Afficher/3890551,Annexe G : Rapport sommaire de TERA sur le milieu aquatique,15.0,Eau,http://www.cer-rec.gc.ca/esa-ees/nrthmntn/nrthmntn_tableau-4-résumé-des-franchissements-de-cours-deau_pt-4_pg-20_num-du-doc-A3Q6H2.csv,20,48.0,5.87,Non,nrthmntn,http://www.cer-rec.gc.ca/esa-ees/nrthmntn.zip,False,nrthmntn_tableau-4-résumé-des-franchissements-de-cours-deau_pt-4_pg-20_num-du-doc-A3Q6H2.csv,1059614_20_lattice-v_1.csv


In [92]:
# Adding an index ID to each file to avoid duplicates
# Add only if there are duplicates. The names already end in '.csv' at this point, so the extension is stripped before the ID is appended
# df['filename'] = df['filename'].str[:-4] + '-no' + df['Index'].astype(str) + '.csv'
# df_FRA['nom_du_fichier'] = df_FRA['nom_du_fichier'].str[:-4] + '-no' + df_FRA['Indice'].astype(str) + '.csv'

In [93]:
# Making sure there are no duplicates in English filenames
n_duplicates = df['filename'].duplicated().sum()
assert n_duplicates == 0, f"Expected 0 duplicate filenames, found {n_duplicates}."

In [94]:
# Making sure there are no duplicates in French filenames
n_duplicates_FRA = df_FRA['nom_du_fichier'].duplicated().sum()
assert n_duplicates_FRA == 0, f"Expected 0 duplicate filenames, found {n_duplicates_FRA}."
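
If one of these assertions fails, the count alone does not say which rows collide. A minimal sketch for listing the offending rows is shown below; the helper name `find_duplicate_names` and the `demo` frame are hypothetical and not part of the original notebook.

```python
import pandas as pd


def find_duplicate_names(frame: pd.DataFrame, column: str) -> pd.DataFrame:
    """Return every row whose value in `column` occurs more than once."""
    # keep=False marks all members of a duplicate group, not just the repeats
    mask = frame[column].duplicated(keep=False)
    return frame.loc[mask].sort_values(column, kind='stable')


# Hypothetical index with one colliding filename
demo = pd.DataFrame({
    'Index': [1, 2, 3],
    'filename': ['x_pt-1.csv', 'y_pt-1.csv', 'x_pt-1.csv'],
})

dupes = find_duplicate_names(demo, 'filename')
print(dupes)
```

With the notebook's data, the same call could be made as `find_duplicate_names(df, 'filename')` or `find_duplicate_names(df_FRA, 'nom_du_fichier')`.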

In [95]:
df['bad_csv'].value_counts()

False    26665
True      2226
Name: bad_csv, dtype: int64

In [44]:
# Where the CSVs are located
csv_folder_path_ENG = 'F:/Environmental Baseline Data/Version 4 - Final/all_csvs_cleaned_latest_ENG/'
csv_folder_path_FRA = 'F:/Environmental Baseline Data/Version 4 - Final/all_csvs_cleaned_latest_FRA/'

In [45]:
%%time
# English CSVs
# If the last file in the index no longer exists under its old name,
# the renaming is assumed to be done already and the loop is skipped
os.chdir(csv_folder_path_ENG)
if os.path.isfile(df['old_filename'].iloc[-1]):
    # Loop through the index: rename good CSVs, delete bad ones
    for index, row in df.iterrows():
        if os.path.isfile(row['old_filename']):
            if not row['bad_csv']:
                shutil.move(row['old_filename'], row['filename'])
            else:
                os.remove(row['old_filename'])

Wall time: 5min 48s


In [46]:
%%time
# French CSVs
# If the last file in the index no longer exists under its old name,
# the renaming is assumed to be done already and the loop is skipped
os.chdir(csv_folder_path_FRA)
if os.path.isfile(df_FRA['vieux_nom_de_fichier'].iloc[-1]):
    # Loop through the index: rename good CSVs, delete bad ones
    for index, row in df_FRA.iterrows():
        if os.path.isfile(row['vieux_nom_de_fichier']):
            if not row['mauvais_csv']:
                shutil.move(row['vieux_nom_de_fichier'], row['nom_du_fichier'])
            else:
                os.remove(row['vieux_nom_de_fichier'])

Wall time: 5min 47s
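
Both renaming cells move or delete files in place, so it may be worth previewing what they would do before running them. The following is only a sketch of such a dry run: the function name `plan_file_actions`, the record layout, and the demo filenames are assumptions for illustration. It classifies each file without touching the disk.

```python
import os
import tempfile


def plan_file_actions(records, folder):
    """Classify (old_name, new_name, is_bad) records without modifying any file."""
    plan = {'rename': [], 'delete': [], 'missing': []}
    for old_name, new_name, is_bad in records:
        if not os.path.isfile(os.path.join(folder, old_name)):
            # Already renamed, already deleted, or never extracted
            plan['missing'].append(old_name)
        elif is_bad:
            plan['delete'].append(old_name)
        else:
            plan['rename'].append((old_name, new_name))
    return plan


# Demo in a throwaway folder with two of the three files present
with tempfile.TemporaryDirectory() as tmp:
    for name in ('1_14_lattice-v_1.csv', '1_17_lattice-v_1.csv'):
        open(os.path.join(tmp, name), 'w').close()
    records = [
        ('1_14_lattice-v_1.csv', 'demo_pt-1_pg-14.csv', False),
        ('1_17_lattice-v_1.csv', 'demo_pt-1_pg-17.csv', True),
        ('1_18_lattice-v_1.csv', 'demo_pt-1_pg-18.csv', False),
    ]
    plan = plan_file_actions(records, tmp)

print(plan)
```

For the English index, the records could be produced with `df[['old_filename', 'filename', 'bad_csv']].itertuples(index=False, name=None)`, and likewise for the French columns.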


In [96]:
# Updating base path to Indices folder to save index files
os.chdir('F:/Environmental Baseline Data/Version 4 - Final/Indices/')

In [97]:
# Saving index files
timestamp = strftime("%Y_%m_%d", localtime())
df.to_csv(f'ESA_website_ENG_{timestamp}_final.csv', encoding='ISO-8859-1', index=False)
df_FRA.to_csv(f'ESA_website_FRA_{timestamp}_final.csv', encoding='ISO-8859-1', index=False)