# SMART WATER ANALYTICS - TEAM 4 INTERNS - EXPLORE AI

##### A PROJECT ON WATER AVAILABILITY MANAGEMENT

### Team Members

#### Viwe Mqaqa                             | South Africa | viwe@explore-datascience.net | Supervisor

##### Oreoluwa Onyekachi Olaiya  | Nigeria | olaiyaoreoluwa3@gmail.com
##### Christian Divinefavour           | Nigeria | contact.christiandivinefavour@gmail.com
##### Gabriel Asiegbu                      | Nigeria | gwap2@live.com
##### Titus Olang'                             | Kenya  | tityewanjohi@gmail.com
##### Michael Kanu                          | Nigeria | michaelkanu01@yahoo.com
##### Michael Omosebi                    | Nigeria | omosebimichael@live.com      

## Project Overview

The Acea Group is one of the leading Italian multi-utility operators. Listed on the Italian Stock Exchange since 1999, the company manages and develops water and electricity networks and environmental services. Acea is the foremost Italian operator in the water services sector, supplying 9 million inhabitants in Lazio, Tuscany, Umbria, Molise, and Campania.

This project focuses on the water sector, with the aim of helping the Acea Group preserve its waterbodies. A water supply company must forecast the water level of each waterbody (water spring, lake, river, or aquifer) in order to meet daily consumption. Waterbodies refill during fall and winter and drain during spring and summer. To help preserve their health, the goal is to predict water availability, in terms of level and flow, for each day of the year.


#### Kaggle Link: https://www.kaggle.com/competitions/acea-water-prediction
#### GitHub Repo: https://github.com/Christian-Divinefavour/smart-water-analytics-explore_ai_team4


## Data Overview

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

In [7]:
aquifers = pd.read_csv('Aquifer_Auser.csv')
rivers = pd.read_csv('River_Arno.csv')
lakes = pd.read_csv('Lake_Bilancino.csv')
spring = pd.read_csv('Water_Spring_Amiata.csv')

desc = pd.read_excel('datasets_description.xlsx')
# Optional: keep only the four datasets loaded above (uncomment to apply).
# Chaining the equality tests with & can never match, so isin is used instead.
#desc = desc[desc.Database.isin(['Aquifer_Auser', 'River_Arno', 'Lake_Bilancino', 'Water_Spring_Amiata'])]
pd.set_option('display.max_rows', desc.shape[0]+1)
pd.options.display.max_colwidth = 500

desc

Unnamed: 0,Database,Description,Output
0,Aquifer_Auser,"Information about the Auser aquifer. This water body consists of two subsystems, that we call NORH and SOUTH, where the former partly influences the behaviour of the latter.\nThe levels of the NORTH sector are represented by the values of the SAL, PAG, CoS and DIEC wells, while the levels of the SOUTH sector by the LT2 well.","Depth_to_Groundwater_SAL, Depth_to_Groundwater_COS, Depth_to_Groundwater_LT2"
1,Water_Spring_Amiata,"Information about the Amiata aquifer. This aquifer is accessed through the Ermicciolo, Arbure, Bugnano and Galleria Alta springs. \nThe levels and volumes of the four springs are influenced by the parameters: pluviometry, sub-gradation, hydrometry, temperatures and drainage volumes.","Flow_Rate_Bugnano, Flow_Rate_Arbure,\n Flow_Rate_Ermicciolo, Flow_Rate_Galleria_Alta"
2,Aquifer_Petrignano,Information about Petrignano aquifer. \nIt is fed by three underground aquifers separated by low permeability septa; the water table can be considered groundwater and is also fed by the Chiascio river.,"Depth_to_Groundwater_P24,\n Depth_to_Groundwater_P25"
3,Aquifer_Doganella,Information about Doganella aquifer. The Doganella well field is fed by two underground aquifers: \nthe upper stratum is a water table with a thickness of about 30m while the lower one is a semi-confined artesian aquifer with a thickness of 50m.,"Depth_to_Groundwater_Pozzo_1, Depth_to_Groundwater_Pozzo_2, Depth_to_Groundwater_Pozzo_3, Depth_to_Groundwater_Pozzo_4, Depth_to_Groundwater_Pozzo_5, Depth_to_Groundwater_Pozzo_6,\nDepth_to_Groundwater_Pozzo_7, Depth_to_Groundwater_Pozzo_8, Depth_to_Groundwater_Pozzo_9"
4,Aquifer_Luco,"Information about Luco aquifer. It is an underground aquifer not fed by rivers or lakes but fed by meteoric infiltration\n and it is accessed through wells called Pozzo_1, Pozzo_3 and Pozzo_4.",Depth_to_Groundwater_Podere_Casetta
5,River_Arno,"Information about Arno river. The Arno is the second largest river in peninsular Italy and the main waterway in Tuscany and it has a relatively torrential regime, \ndue to the nature of the surrounding soils (marl and impermeable clays)",Hydrometry_Nave_di_Rosano
6,Lake_Bilancino,"Information about Bilancino Lake. It is an artificial lake in Mugello, in the province of Florence. \n It has a maximum depth of thirty-one metres and a surface area of 5 square kilometres.","Lake_Level, \nFlow_Rate"
7,Water_Spring_Madonna_di_Canneto,Information about Madonna di Canneto water spring. The Madonna di Canneto spring is situated at an altitude of 1010m above sea level in the Canneto valley. \nIt does not consist of an aquifer and its source is supplied by the water catchment area of the river Melfa,Flow_Rate
8,Water_Spring_Lupa,Information about Lupa water spring. It is located in the Arrone area and is used for drinking use.,Flow_Rate
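
The `Output` column in the table above lists each dataset's target variables as a comma-separated string, sometimes with embedded line breaks. A minimal sketch of turning those strings into clean lists of column names follows. The helper `parse_targets` and the frame `desc_sample` are hypothetical names introduced for illustration, and the two rows are copied from the table above rather than read from `datasets_description.xlsx`.

```python
import pandas as pd


def parse_targets(output_cell):
    """Split a comma-separated Output cell into clean column names."""
    # strip() removes the spaces and line breaks that surround some names
    return [name.strip() for name in output_cell.split(",") if name.strip()]


# Two rows copied from the description table above (illustration only).
desc_sample = pd.DataFrame({
    "Database": ["Lake_Bilancino", "River_Arno"],
    "Output": ["Lake_Level, \nFlow_Rate", "Hydrometry_Nave_di_Rosano"],
})

# Map each dataset name to its list of target columns.
targets = {
    row.Database: parse_targets(row.Output)
    for row in desc_sample.itertuples(index=False)
}
print(targets)
```

Applied to the full `desc` frame, the same dictionary comprehension should give the target columns for every dataset, assuming the `Output` cells are all strings.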


In [3]:
pd.set_option('display.max_rows', aquifers.shape[0]+1)
aquifers.T

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,8144,8145,8146,8147,8148,8149,8150,8151,8152,8153
Date,05/03/1998,06/03/1998,07/03/1998,08/03/1998,09/03/1998,10/03/1998,11/03/1998,12/03/1998,13/03/1998,14/03/1998,...,21/06/2020,22/06/2020,23/06/2020,24/06/2020,25/06/2020,26/06/2020,27/06/2020,28/06/2020,29/06/2020,30/06/2020
Rainfall_Gallicano,,,,,,,,,,,...,0,0,0,0,0,0,0,0,0,0
Rainfall_Pontetetto,,,,,,,,,,,...,0.4,0,0,0,0,0,0,0,0,0
Rainfall_Monte_Serra,,,,,,,,,,,...,0,0,0,0,0,0,0,0,0,0
Rainfall_Orentano,,,,,,,,,,,...,0,0,0,0,0,0,0,0,0,0
Rainfall_Borgo_a_Mozzano,,,,,,,,,,,...,0,0,0,0,0,0,0,0,0,0
Rainfall_Piaggione,,,,,,,,,,,...,0,0,0,0,0,0,0,0,0,0
Rainfall_Calavorno,,,,,,,,,,,...,0,0,0,0,0,0,0,0,0,0
Rainfall_Croce_Arcana,,,,,,,,,,,...,0,0,0,0,0,0,0,0,0,0
Rainfall_Tereglio_Coreglia_Antelminelli,,,,,,,,,,,...,0,0,0,0,0,0,0,0,0,0


In [4]:
pd.set_option('display.max_rows', rivers.shape[0]+1)
rivers.T

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,8207,8208,8209,8210,8211,8212,8213,8214,8215,8216
Date,01/01/1998,02/01/1998,03/01/1998,04/01/1998,05/01/1998,06/01/1998,07/01/1998,08/01/1998,09/01/1998,10/01/1998,...,21/06/2020,22/06/2020,23/06/2020,24/06/2020,25/06/2020,26/06/2020,27/06/2020,28/06/2020,29/06/2020,30/06/2020
Rainfall_Le_Croci,,,,,,,,,,,...,0,0,0,0,0,0,0,0,0,0
Rainfall_Cavallina,,,,,,,,,,,...,0,0,0,0,0,0,0,0,0,0
Rainfall_S_Agata,,,,,,,,,,,...,0,0,0,0,0,0,0,0,0,0
Rainfall_Mangona,,,,,,,,,,,...,0,0,0,0,0,0,0,0,0,0
Rainfall_S_Piero,,,,,,,,,,,...,0,0,0,0,0,0,0,0,0,0
Rainfall_Vernio,,,,,,,,,,,...,,,,,,,,,,
Rainfall_Stia,,,,,,,,,,,...,,,,,,,,,,
Rainfall_Consuma,,,,,,,,,,,...,,,,,,,,,,
Rainfall_Incisa,,,,,,,,,,,...,,,,,,,,,,
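
The preview above shows that some rainfall columns are empty at the start of the record and others are empty at both ends. One way to quantify this, sketched below, is to compute the share of missing values per column together with the first and last row that holds a value. The helper `missing_summary` is a hypothetical name, and the toy frame uses made-up values (not taken from `River_Arno.csv`) purely to show the shape of the result.

```python
import numpy as np
import pandas as pd


def missing_summary(df):
    """Per-column share of NaNs plus first and last valid index labels."""
    return pd.DataFrame({
        "missing_share": df.isna().mean(),
        "first_valid": pd.Series({c: df[c].first_valid_index() for c in df.columns}),
        "last_valid": pd.Series({c: df[c].last_valid_index() for c in df.columns}),
    })


# Toy values for illustration only; not real measurements.
toy = pd.DataFrame({
    "Rainfall_Le_Croci": [np.nan, np.nan, 0.0, 0.0],
    "Rainfall_Vernio": [np.nan, 1.2, np.nan, np.nan],
})

summary = missing_summary(toy)
print(summary)
```

Calling `missing_summary(rivers.drop(columns='Date'))` would produce the same table for the real data, assuming `rivers` has been loaded as in the cells above.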


In [5]:
pd.set_option('display.max_rows', lakes.shape[0]+1)
lakes.T

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,6593,6594,6595,6596,6597,6598,6599,6600,6601,6602
Date,03/06/2002,04/06/2002,05/06/2002,06/06/2002,07/06/2002,08/06/2002,09/06/2002,10/06/2002,11/06/2002,12/06/2002,...,21/06/2020,22/06/2020,23/06/2020,24/06/2020,25/06/2020,26/06/2020,27/06/2020,28/06/2020,29/06/2020,30/06/2020
Rainfall_S_Piero,,,,,,,,,,,...,0,0,0,0,0,0,0,0,0,0
Rainfall_Mangona,,,,,,,,,,,...,0,0,0,0,0,0,0,0,0,0
Rainfall_S_Agata,,,,,,,,,,,...,0,0,0,0,0,0,0,0,0,0
Rainfall_Cavallina,,,,,,,,,,,...,0,0,0,0,0,0,0,0,0,0
Rainfall_Le_Croci,,,,,,,,,,,...,0,0,0,0,0,0,0,0,0,0
Temperature_Le_Croci,,,,,,,,,,,...,21.95,23.05,22.5,22.15,22.35,22.5,23.4,21.5,23.2,22.75
Lake_Level,249.43,249.43,249.43,249.43,249.44,249.56,249.57,249.58,249.57,249.57,...,250.88,250.86,250.86,250.86,250.87,250.85,250.84,250.83,250.82,250.8
Flow_Rate,0.31,0.31,0.31,0.31,0.31,0.38,0.38,0.48,0.48,0.48,...,0.6,0.6,0.6,0.6,0.6,0.6,0.6,0.6,0.6,0.6


In [6]:
pd.set_option('display.max_rows', spring.shape[0]+1)
spring.T

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,7477,7478,7479,7480,7481,7482,7483,7484,7485,7486
Date,01/01/2000,02/01/2000,03/01/2000,04/01/2000,05/01/2000,06/01/2000,07/01/2000,08/01/2000,09/01/2000,10/01/2000,...,21/06/2020,22/06/2020,23/06/2020,24/06/2020,25/06/2020,26/06/2020,27/06/2020,28/06/2020,29/06/2020,30/06/2020
Rainfall_Castel_del_Piano,,,,,,,,,,,...,0,0.2,0,0,0,0,0,0,0,0
Rainfall_Abbadia_S_Salvatore,,,,,,,,,,,...,0,6.4,0,0,0,0,0,0,0,0
Rainfall_S_Fiora,,,,,,,,,,,...,0,3.2,0,0,0,0,0,0,0,0
Rainfall_Laghetto_Verde,,,,,,,,,,,...,0,5,0,0,0,0,0,0,0,0
Rainfall_Vetta_Amiata,,,,,,,,,,,...,0,0,0,0,0,0,0,0,0,0
Depth_to_Groundwater_S_Fiora_8,,,,,,,,,,,...,-38.39,-38.39,-38.39,-38.39,-38.38,-38.38,-38.38,-38.38,-38.37,-38.37
Depth_to_Groundwater_S_Fiora_11bis,,,,,,,,,,,...,-51.9,-51.9,-51.9,-51.9,-51.9,-51.89,-51.89,-51.89,-51.89,-51.89
Depth_to_Groundwater_David_Lazzaretti,,,,,,,,,,,...,,,,,-303.27,-303.27,-303.27,-303.27,-303.28,-303.27
Temperature_Abbadia_S_Salvatore,,,,,,,,,,,...,18.35,19.7,20.3,21.15,21.45,20.7,20.5,22.1,22.45,22
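
In all four previews the `Date` column is stored as text in day/month/year order (for example `05/03/1998` followed by `06/03/1998`). A minimal sketch of converting it to a proper datetime index is shown below. The helper `with_date_index` is a hypothetical name, and the sample rows are copied from the Lake_Bilancino preview above. Passing an explicit `format` avoids the dates being read as month/day/year.

```python
import pandas as pd


def with_date_index(df, date_col="Date"):
    """Return a copy of df indexed by a parsed, sorted datetime column."""
    out = df.copy()
    # The files use day/month/year, so the format is stated explicitly.
    out[date_col] = pd.to_datetime(out[date_col], format="%d/%m/%Y")
    return out.set_index(date_col).sort_index()


# First three rows of the Lake_Bilancino preview (illustration only).
sample = pd.DataFrame({
    "Date": ["03/06/2002", "04/06/2002", "05/06/2002"],
    "Lake_Level": [249.43, 249.43, 249.43],
})

indexed = with_date_index(sample)
print(indexed.index.min(), indexed.index.max())
```

With a datetime index in place, time-based operations such as `indexed.resample('M').mean()` become available for looking at seasonal patterns.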


#### Aquifers

In [10]:
aquifers.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8154 entries, 0 to 8153
Data columns (total 27 columns):
Date                                       8154 non-null object
Rainfall_Gallicano                         5295 non-null float64
Rainfall_Pontetetto                        5295 non-null float64
Rainfall_Monte_Serra                       5289 non-null float64
Rainfall_Orentano                          5295 non-null float64
Rainfall_Borgo_a_Mozzano                   5295 non-null float64
Rainfall_Piaggione                         4930 non-null float64
Rainfall_Calavorno                         5295 non-null float64
Rainfall_Croce_Arcana                      5295 non-null float64
Rainfall_Tereglio_Coreglia_Antelminelli    5295 non-null float64
Rainfall_Fabbriche_di_Vallico              5295 non-null float64
Depth_to_Groundwater_LT2                   4802 non-null float64
Depth_to_Groundwater_SAL                   4545 non-null float64
Depth_to_Groundwater_PAG                   3807 n