The dataset provided contains information on various aspects related to greenhouse farming, including outdoor and indoor climate, irrigation, actuators, setpoints, resource consumption, harvest, crop parameters, tomato quality, lab analysis, and root zone/slab information. Let's break down the key parameters and their descriptions in each category:

### Teams:
The dataset involves contributions from different teams, each with its own captain and members, participating in a challenge related to greenhouse farming.

### Greenhouse Information:
- The greenhouse consists of five compartments with a total area of 96 m² and a growing area of 62.5 m².

### Weather Data:
1. **Tout**: Outside temperature (°C)
2. **Rhout**: Outside relative humidity (%)
3. **Iglob**: Solar Radiation (W/m²)
4. **Windsp**: Wind speed (m/s)
5. **RadSum**: Radiation sum (J/cm²)
6. **Winddir**: Wind direction (Compass direction [0 to 128])
7. **Rain**: Rain (status: 1=rain, 0=dry)
8. **PARout**: PAR (Photosynthetically Active Radiation) weather measurement (µmol/m² s)
9. **Pyrgeo**: Heat emission from a pyrgeometer (W/m²)
10. **AbsHumOut**: Absolute humidity content of outside air (g/m³)
11. **Time**: Timestamps (5 min interval)

### Greenhouse Climate and Actuators:
- Parameters related to the indoor climate, status of actuators, and irrigation in the greenhouse.
1. **Tair**: Greenhouse air temperature (°C)
2. **Rhair**: Greenhouse relative humidity (%)
3. **CO2air**: CO2 concentration in the greenhouse (ppm)
4. **HumDef**: Greenhouse humidity deficit (g/m³)
5. **VentLee**: Leeward vents opening (%)
6. **Ventwind**: Windward vents opening (%)
7. **AssimLight**: HPS lamps status (on-off)
8. **EnScr**: Energy curtain opening (%)
9. **BlackScr**: Blackout curtain opening (%)
10. **PipeLow**: Rail pipe temperature (Lower circuit) (°C)
11. **PipeGrow**: Crop pipe temperature (Growth circuit) (°C)
12. **co2_dos**: CO2 dosing (kg/ha hour)
13. **Tot_PAR**: Total inside PAR (µmol/m² s)
14. **EC_drain_PC**: Drain EC (Electrical Conductivity) (dS/m)
15. **pH_drain_PC**: Drain pH
16. **water_sup**: Cumulative number of minutes of irrigation in a day (minutes)
17. **Cum_irr**: Cumulative number of liters of irrigation in a day (L/m² day)

### Climate and Irrigation Setpoints:
- Setpoints for various climate and irrigation parameters.
1. **co2_sp**: CO2 setpoint (ppm)
2. **dx_sp**: Humidity deficit setpoint (g/m³)
3. **t_rail_min_sp**: Rail pipe minimum temperature setpoint (°C)
4. **t_grow_min_sp**: Crop pipe minimum temperature setpoint (°C)
5. And others...

### VIP (Realized Setpoints):
- Realized setpoints for various climate and irrigation parameters.

### Production:
- Parameters related to tomato production.
1. **ProdA**: Total tomato production quality class A (kg/m² at date of harvest)
2. **ProdB**: Total tomato production quality class B (kg/m² at date of harvest)
3. **avg_nr_harvested_trusses**: Average number of harvested trusses per stem
4. And others...

### Crop Parameters:
- Parameters related to crop development.
1. **Stem_elong**: Stem growth per week (cm/week)
2. **Stem_thick**: Stem thickness (mm)
3. **Cum_trusses**: Cumulative number of new set trusses on the stem
4. **stem_dens**: Stem density (Stems/m²)
5. **plant_dens**: Plant density (Plants/m²)

### Resources:
- Consumption of various resources.
1. **Heat_cons**: Heating energy consumption (MJ/m² day)
2. **ElecHigh**: Electricity consumption (artificial light) during peak hours (kWh/m² day)
3. **ElecLow**: Electricity consumption (artificial light) during off-peak hours (kWh/m² day)
4. **CO2_cons**: CO2 consumption (kg/m² day)
5. **Irr**: Irrigation water consumption (L/m² day)
6. **Drain**: Drain water (L/m² day)

### Tomato Quality and Dry Matter Content:
- Parameters related to the quality of tomatoes.
1. **Flavour**: Flavour level (0=dislike, 100=like)
2. **TSS**: Total Soluble Solids (°Brix)
3. **Acid**: Titratable acid (mmol H3O+/100 g)
4. **%Juice**: Percentage juice pressed from the fruit wall of the tomato
5. And others...

### Lab Analysis:
- Parameters obtained from laboratory analysis of irrigation and drainage water.
1. **irr_PH, irr_EC, irr_NH4, ...**: Parameters related to irrigation water
2. **drain_PH, drain_EC, drain_NH4, ...**: Parameters related to drainage water

### Root Zone Data (Grodan Sensors):
- Parameters related to the root zone obtained from Grodan sensors.
1. **EC_slab1, EC_slab2**: Electrical Conductivity
2. **WC_slab1, WC_slab2**: Slab water content
3. **t_slab1, t_slab2**: Slab temperature

The dataset is structured with various parameters and their respective units, intervals, data types, and sources. It covers a wide range of information essential for analyzing and optimizing greenhouse farming operations.

Importing files

In [3]:
import pandas as pd

In [4]:
df1=pd.read_csv('Weather.csv')
df2=pd.read_csv('TomQuality.csv')
df3=pd.read_csv('Resources.csv')
df4=pd.read_csv('Production.csv')
df5=pd.read_csv('LabAnalysis.csv')
df6=pd.read_csv('GrodanSens.csv')
df8=pd.read_csv('CropParameters.csv')

In [6]:
df7 = pd.read_csv('GreenhouseClimate.csv', low_memory=False)

In [7]:
df1

Unnamed: 0,%time,AbsHumOut,Iglob,PARout,Pyrgeo,RadSum,Rain,Rhout,Tout,Winddir,Windsp
0,43815.00000,6.220954,0.0,0.000000e+00,-72.0,215.0,0.0,80.6,6.9,32.0,4.7
1,43815.00347,6.220954,0.0,0.000000e+00,-73.0,0.0,0.0,80.6,6.9,32.0,4.7
2,43815.00694,6.205565,0.0,0.000000e+00,-76.0,0.0,0.0,80.4,6.9,32.0,4.7
3,43815.01042,6.190173,0.0,0.000000e+00,-77.0,0.0,0.0,80.2,6.9,32.0,4.7
4,43815.01389,6.162624,0.0,0.000000e+00,-75.0,0.0,0.0,80.9,6.7,32.0,4.7
...,...,...,...,...,...,...,...,...,...,...,...
47804,43980.98611,9.286397,0.0,9.999999e-01,-85.0,2992.0,0.0,71.4,15.1,2.0,4.3
47805,43980.98958,9.242139,0.0,1.000000e+00,-85.0,2992.0,0.0,71.5,15.0,2.0,4.3
47806,43980.99306,9.152067,0.0,3.350000e-08,-84.0,2992.0,0.0,70.8,15.0,2.0,3.8
47807,43980.99653,9.177802,0.0,0.000000e+00,-85.0,2992.0,0.0,71.0,15.0,2.0,3.8


In [8]:
df2

Unnamed: 0,%time,Flavour,TSS,Acid,%Juice,Bite,Weight,DMC_fruit
0,43880,77,8.0,14.5,66,179,8.97,
1,43894,77,8.6,14.5,63,274,8.8,
2,43908,73,8.4,14.0,56,315,10.0,8.19
3,43922,79,9.2,14.4,60,382,9.1,9.31
4,43936,80,9.3,14.5,61,288,8.5,9.46
5,43950,77,9.0,13.0,61,300,9.9,8.75
6,43964,82,9.6,13.2,66,238,9.97,8.87
7,43980,72,8.2,11.8,64,201,11.6,9.33


In [9]:
df3

Unnamed: 0,%Time,Heat_cons,ElecHigh,ElecLow,CO2_cons,Irr,Drain
0,43815,2.71,1.1,0.0,0.007,0.00,0.00
1,43816,0.92,1.0,0.0,0.009,0.00,0.00
2,43817,0.97,1.0,0.7,0.014,0.00,0.00
3,43818,0.10,0.9,0.5,0.025,0.00,0.00
4,43819,2.24,0.8,0.5,0.017,0.76,0.00
...,...,...,...,...,...,...,...
161,43976,0.43,0.0,0.0,0.065,5.40,2.90
162,43977,0.42,0.0,0.0,0.061,5.40,2.69
163,43978,0.44,0.0,0.0,0.088,5.40,2.33
164,43979,0.98,0.0,0.0,0.101,5.76,2.94


In [10]:
df4

Unnamed: 0,%time,ProdA,ProdB,avg_nr_harvested_trusses,Truss development time,Nr_fruits_ClassA,Weight_fruits_ClassA,Nr_fruits_ClassB,Weight_fruits_ClassB
0,43880,0.037,0.0,0.1,50.0,,128.0,0,0
1,43885,0.767,0.0,0.9,54.5,136.0,1271.0,0,0
2,43889,0.232,0.0,0.6,51.0,89.0,788.0,0,0
3,43894,0.778,0.0,1.5,55.1,226.0,2001.0,0,0
4,43899,0.248,0.0,0.9,53.6,133.0,1144.0,0,0
5,43903,0.354,0.0,0.7,52.4,83.0,811.0,0,0
6,43908,0.795,0.0,1.0,53.3,126.0,1230.0,0,0
7,43913,0.486,0.0,0.9,54.9,148.0,1297.5,0,0
8,43917,0.454,0.0,0.9,52.6,126.0,1069.0,0,0
9,43922,0.697,0.0,0.6,54.0,92.0,777.0,0,0


In [11]:
df5

Unnamed: 0,%Time,irr_PH,irr_EC,irr_NH4,irr_K,irr_Na,irr_Ca,irr_Mg,irr_Si,irr_NO3,...,drain_Cl,drain_SO4,drain_HCO3,drain_PO4,drain_Fe,drain_Mn,drain_Zn,drain_B,drain_Cu,drain_Mo
0,43836,5.3,4.3,1.1,15.2,0.3,9.6,4.9,0.09,21.7,...,7.6,13.5,1.4,2.0,27.8,7.7,5.7,49.0,0.9,0.61
1,43850,5.2,4.0,1.1,11.3,0.3,9.8,4.1,0.09,21.0,...,10.0,23.4,1.3,0.42,21.1,0.8,6.4,56.0,0.7,0.94
2,43864,4.8,3.9,1.3,11.6,0.3,10.2,3.5,0.09,18.6,...,12.0,17.5,1.0,1.3,18.3,1.1,3.7,93.0,0.5,0.47
3,43879,4.8,3.7,1.4,13.3,0.4,8.2,2.8,0.09,16.0,...,20.0,13.2,0.1,6.1,24.4,4.5,3.8,134.0,1.0,0.13
4,43893,4.7,3.5,1.1,11.3,0.4,7.6,2.5,0.01,14.1,...,21.2,11.6,0.1,9.32,31.0,11.0,6.5,113.0,1.1,0.1
5,43908,5.4,3.4,1.3,13.4,0.3,6.5,1.9,0.009,17.1,...,4.3,12.3,0.09,6.41,35.0,9.8,8.2,60.0,1.1,0.1
6,43921,5.6,3.1,1.2,11.0,0.2,5.6,2.0,0.009,16.9,...,0.1,11.0,0.09,5.6,30.0,13.0,9.5,37.0,1.1,0.1
7,43936,5.7,2.9,1.2,9.3,0.3,6.4,2.7,0.009,16.6,...,0.09,11.3,0.09,5.99,27.0,15.0,8.3,66.0,0.8,0.09
8,43951,6.0,3.6,0.8,13.6,0.3,8.3,3.2,0.01,17.0,...,0.09,9.9,0.09,5.2,35.0,15.0,9.9,68.0,0.8,0.09
9,43963,5.7,3.0,0.8,9.1,0.2,6.2,2.1,0.01,15.7,...,0.2,12.3,0.09,5.99,53.0,27.0,13.0,86.0,1.0,0.09


In [12]:
df6

Unnamed: 0,%time,EC_slab1,EC_slab2,WC_slab1,WC_slab2,t_slab1,t_slab2
0,43815.00000,,,,,,
1,43815.00347,,,,,,
2,43815.00694,,,,,,
3,43815.01042,,,,,,
4,43815.01389,,,,,,
...,...,...,...,...,...,...,...
47804,43980.98611,,,,,,
47805,43980.98958,,,,,,
47806,43980.99306,,,,,,
47807,43980.99653,,,,,,


In [13]:
df7

Unnamed: 0,%time,AssimLight,BlackScr,CO2air,Cum_irr,EC_drain_PC,EnScr,HumDef,PipeGrow,PipeLow,...,t_rail_min_sp,t_rail_min_vip,t_vent_sp,t_ventlee_vip,t_ventwind_vip,water_sup,water_sup_intervals_sp_min,water_sup_intervals_vip_min,window_pos_lee_sp,window_pos_lee_vip
0,43815.00000,100,35,509,31.6,0.3,96,8.8,0.0,49.9,...,,0.0,,25.0,26.0,263.0,,10,,1.2
1,43815.00347,100,85,484,31.8,0.3,96,9.2,0.0,48.5,...,,0.0,,25.0,26.0,265.0,,10,,1.2
2,43815.00694,100,96,475,31.8,0.3,96,9.1,0.0,46.8,...,,0.0,,25.0,26.0,265.0,,10,,1.2
3,43815.01042,100,96,501,32.0,0.3,96,8.5,0.0,45.2,...,,0.0,,25.0,26.0,267.0,,10,,1.2
4,43815.01389,100,96,487,32.0,0.3,96,8.5,0.0,43.8,...,,0.0,,25.0,26.0,267.0,,10,,1.2
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
47804,43980.98611,0,0,479,4.7,6.8,0,2.1,0.0,0.0,...,,0.0,18.0,18.0,18.0,39.0,,1440,0.0,0.0
47805,43980.98958,0,0,485,4.7,6.8,0,2.1,0.0,0.0,...,,0.0,18.0,18.0,18.0,39.0,,1440,0.0,0.0
47806,43980.99306,0,0,465,4.7,6.8,0,2.0,0.0,0.0,...,,0.0,18.0,18.0,18.0,39.0,,1440,0.0,0.0
47807,43980.99653,0,0,470,4.7,6.8,0,1.9,0.0,0.0,...,,0.0,18.0,18.0,18.0,39.0,,1440,0.0,0.0


In [14]:
df8

Unnamed: 0,%Time,Stem_elong,Stem_thick,Cum_trusses,stem_dens,plant_dens
0,43823,14.4,7.1,,3.6,1.8
1,43830,27.9,10.5,0.9,3.6,1.8
2,43838,32.8,12.8,2.8,3.6,1.8
3,43845,30.2,12.8,4.0,3.6,1.8
4,43852,30.3,12.5,5.2,4.5,1.8
5,43859,33.6,11.5,6.7,4.5,1.8
6,43866,34.7,11.3,7.8,4.5,1.8
7,43873,39.3,11.3,8.9,5.4,1.8
8,43880,39.3,11.7,10.2,5.4,1.8
9,43887,36.6,10.8,11.4,5.4,1.8


# Data Description and parametric descriptions

df1:     "Weather" dataset:

1. **Time Information:**
   - `%time`: Timestamp or time information corresponding to each entry in the dataset.

2. **Weather Metrics:**
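   - `AbsHumOut`: Absolute humidity content of outside air (g/m³).
   - `Iglob`: Global solar radiation (W/m²).
   - `PARout`: Photosynthetically Active Radiation measured outside (µmol/m² s).
   - `Pyrgeo`: Heat emission measured by a pyrgeometer (W/m²). This is long-wave radiation, not a pyranometer reading of solar radiation.
   - `RadSum`: Radiation sum (J/cm²).
   - `Rain`: Rain status (1 = rain, 0 = dry), not a rainfall amount.
   - `Rhout`: Outside relative humidity (%).
   - `Tout`: Outside temperature (°C).
   - `Winddir`: Wind direction (compass direction, 0 to 128).
   - `Windsp`: Wind speed (m/s).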

3. **Notes:**
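   - The dataset contains meteorological parameters recorded at 5-minute intervals.
   - In the displayed rows, `Iglob` and `Rain` are constantly 0 (the rows shown are night-time samples), while `Pyrgeo`, `RadSum`, `Rhout`, and `Tout` vary.

4. **Time Range:**
   - `%time` runs from 43815.00000 to 43980.99653 in steps of about 0.00347 day (5 minutes), which gives 47,808 rows. The values look like Excel-style serial day numbers; under that assumption the series spans 16 December 2019 to 29 May 2020.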
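
The `%time` values can be turned into ordinary timestamps before plotting or resampling. The sketch below assumes the column holds Excel-style serial day numbers (days since 1899-12-30); that encoding is inferred from the values and is not stated in the files. The helper name `serial_to_datetime` is hypothetical.

```python
import pandas as pd

def serial_to_datetime(serial: pd.Series) -> pd.Series:
    """Convert serial day numbers to timestamps (assumed Excel-style encoding)."""
    ts = pd.to_datetime(serial, unit="D", origin="1899-12-30")
    # The serials are stored with 5 decimals, so round onto the 5-minute grid.
    return ts.dt.round("5min")

# Values copied from the first and last rows of the weather table above.
sample = pd.Series([43815.00000, 43815.00347, 43980.99653])
converted = serial_to_datetime(sample)
print(converted)
```

The same helper applies to the daily files, where `%time` / `%Time` holds whole-day serials.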
------------------------------------------------------------------------------------------
df2:       "Tom Quality" dataset:

1. **Time Information:**
   - `%time`: Timestamp or time information corresponding to each entry in the dataset.

2. **Tomato Quality Metrics:**
   - `Flavour`: Flavor rating of the tomato.
   - `TSS` (Total Soluble Solids): Represents the concentration of dissolved sugars, minerals, and acids in the tomato. It is often associated with sweetness.
   - `Acid`: Acidity level in the tomato.
   - `%Juice`: Percentage of juice content in the tomato.
   - `Bite`: Represents the firmness or texture of the tomato.
   - `Weight`: Weight of the tomato.
   - `DMC_fruit`: Dry Matter Content of the tomato fruit.

3. **Notes:**
   - Some entries have missing values (NaN) for the `DMC_fruit` metric.

4. **Time Range:**
   - `%time` runs from 43880 to 43980 as whole-day serial numbers; the 8 samples are spaced roughly 14 days apart.

------------------------------------------------------------------------------------------
df3:       "Resources" dataset:

1. **Time Information:**
   - `%Time`: Timestamp or time information corresponding to each entry in the dataset.

2. **Resource Consumption Metrics:**
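   - `Heat_cons`: Heating energy consumption (MJ/m² day).
   - `ElecHigh`: Electricity consumption for artificial light during peak hours (kWh/m² day).
   - `ElecLow`: Electricity consumption for artificial light during off-peak hours (kWh/m² day).
   - `CO2_cons`: CO2 consumption (kg/m² day).
   - `Irr`: Irrigation water consumption (L/m² day).
   - `Drain`: Drain water (L/m² day).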

3. **Overview:**
   - The dataset provides a time-series record of resource and environmental metrics.
   - Resource metrics include heat consumption, high and low electrical consumption, CO2 consumption, irrigation, and drainage.
   - Values are recorded once per day; the displayed rows run from `%Time` 43815 to 43979.

4. **Notes:**
   - The `%Time` column serves as a temporal reference for each dataset entry.
   - Resource values are recorded for various time points, allowing analysis of resource consumption patterns.
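
One consumption pattern that follows directly from these columns is the share of irrigation water that leaves as drain. The sketch below uses a few rows copied from the Resources table above; the column name `drain_pct` is a hypothetical addition, and days without irrigation are left as NaN rather than dividing by zero.

```python
import pandas as pd

# Rows copied from the Resources table above (Irr and Drain in L/m² day).
res = pd.DataFrame({
    "%Time": [43815, 43819, 43976, 43979],
    "Irr":   [0.00, 0.76, 5.40, 5.76],
    "Drain": [0.00, 0.00, 2.90, 2.94],
})

# Mask days with no irrigation so the ratio is NaN instead of a division by zero.
irr = res["Irr"].where(res["Irr"] > 0)
res["drain_pct"] = 100 * res["Drain"] / irr
print(res)
```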

------------------------------------------------------------------------------------------
df4:       "Production" dataset:

1. **Time Information:**
   - `%time`: The timestamp or time information for each entry in the dataset.

2. **Production Metrics:**
   - `ProdA` and `ProdB`: Total tomato production of quality class A and quality class B (kg/m² at date of harvest).
   - `avg_nr_harvested_trusses`: The average number of harvested trusses.
   - `Truss development time`: Time taken for truss development.
   - `Nr_fruits_ClassA` and `Weight_fruits_ClassA`: Number and weight of fruits for Class A.
   - `Nr_fruits_ClassB` and `Weight_fruits_ClassB`: Number and weight of fruits for Class B.

3. **Notes:**
   - Some entries have missing values (NaN), e.g. `Nr_fruits_ClassA` in the first row.
   - The resource consumption metrics (`Heat_cons`, `ElecHigh`, `ElecLow`, `CO2_cons`, `Irr`, `Drain`) are not columns of this file; they belong to the "Resources" dataset (df3).

4. **Time Range:**
   - The displayed rows run from `%time` 43880 to 43922, with one harvest record every 4 to 5 days.
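
Two derived quantities are often useful here: cumulative yield and mean fruit weight per harvest. The sketch below uses the first three rows of the Production table above. The column names `cum_yield` and `mean_fruit_weight_A` are hypothetical, and the unit of `Weight_fruits_ClassA` is not stated in the description, so the resulting weight is only meaningful in whatever unit that column uses.

```python
import pandas as pd

# First three rows of the Production table above.
prod = pd.DataFrame({
    "%time": [43880, 43885, 43889],
    "ProdA": [0.037, 0.767, 0.232],
    "ProdB": [0.0, 0.0, 0.0],
    "Nr_fruits_ClassA": [float("nan"), 136.0, 89.0],
    "Weight_fruits_ClassA": [128.0, 1271.0, 788.0],
})

# Running total of class A + class B production (kg/m²).
prod["cum_yield"] = (prod["ProdA"] + prod["ProdB"]).cumsum()

# Mean weight per class A fruit; stays NaN where the fruit count is missing.
prod["mean_fruit_weight_A"] = (
    prod["Weight_fruits_ClassA"] / prod["Nr_fruits_ClassA"]
)
print(prod[["%time", "cum_yield", "mean_fruit_weight_A"]])
```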

-----------------------------------------------------------------------------
df5:          "Lab Analysis" dataset:
   - **Columns:**
     ```
     %Time, irr_PH, irr_EC, irr_NH4, irr_K, irr_Na, irr_Ca, irr_Mg, irr_Si, irr_NO3, irr_Cl, irr_SO4, irr_HCO3, irr_PO4, irr_Fe, irr_Mn, irr_Zn, irr_B, irr_Cu, irr_Mo, drain_PH, drain_EC, drain_NH4, drain_K, drain_Na, drain_Ca, drain_Mg, drain_Si, drain_NO3, drain_Cl, drain_SO4, drain_HCO3, drain_PO4, drain_Fe, drain_Mn, drain_Zn, drain_B, drain_Cu, drain_Mo
     ```
   - **Explanation:**
     - `%Time`: Represents a timestamp or time-related information for lab analysis.
     - The other columns represent various parameters like pH, electrical conductivity (EC), concentrations of different elements (NH4, K, Na, Ca, Mg, Si, NO3, Cl, SO4, HCO3, PO4, Fe, Mn, Zn, B, Cu, Mo) in irrigation (irr_) and drainage (drain_) samples.
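
Because every column name encodes both the sample source (`irr_` or `drain_`) and the measured parameter, the wide table can be reshaped into a long one with separate `source` and `parameter` columns. This is a minimal sketch on a handful of values copied from the Lab Analysis table above; the names `tidy`, `source`, `parameter`, and `value` are hypothetical.

```python
import pandas as pd

# A few columns and the first two rows copied from the Lab Analysis table above.
lab = pd.DataFrame({
    "%Time": [43836, 43850],
    "irr_PH": [5.3, 5.2],
    "irr_EC": [4.3, 4.0],
    "drain_Cl": [7.6, 10.0],
    "drain_SO4": [13.5, 23.4],
})

# Wide -> long: one row per (time, column) pair.
tidy = lab.melt(id_vars="%Time", var_name="column", value_name="value")

# Split "irr_EC" into source="irr" and parameter="EC" at the first underscore.
tidy[["source", "parameter"]] = tidy["column"].str.split("_", n=1, expand=True)
tidy = tidy.drop(columns="column")
print(tidy)
```

In this layout, irrigation and drain values of the same parameter can be compared with a simple `groupby` or pivot on `parameter`.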

------------------------------------------------------------------------------------------------------
df6:       "GrodanSens" dataset:

In the displayed head and tail rows of the "GrodanSens" dataset, every value of `EC_slab1`, `EC_slab2`, `WC_slab1`, `WC_slab2`, `t_slab1`, and `t_slab2` is NaN (Not a Number). The snippet alone does not show whether the sensors report values elsewhere in the file, so the share of missing values should be checked over the full table.

Here's an overview of the parameters in the dataset:

1. **Time Information:**
   - `%time`: Timestamp or time information corresponding to each entry in the dataset.

2. **GrodanSens Metrics:**
   - `EC_slab1`: Electrical conductivity of slab 1.
   - `EC_slab2`: Electrical conductivity of slab 2.
   - `WC_slab1`: Water content of slab 1.
   - `WC_slab2`: Water content of slab 2.
   - `t_slab1`: Temperature of slab 1.
   - `t_slab2`: Temperature of slab 2.

3. **Notes:**
   - The dataset may represent measurements related to the electrical conductivity, water content, and temperature of different slabs. 
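
To see whether the NaN values are limited to the displayed rows, the share of missing values per column can be computed over the whole table. The sketch below is a minimal version of that check; `missing_share` is a hypothetical helper, and the small frame only mirrors the head of the table shown above. On the real data it would be called as `missing_share(df6)`.

```python
import numpy as np
import pandas as pd

def missing_share(df: pd.DataFrame, time_col: str = "%time") -> pd.Series:
    """Fraction of NaN values per column, ignoring the time column."""
    return df.drop(columns=[time_col]).isna().mean()

# Stand-in for the first rows of the GrodanSens table above (all sensors NaN).
grodan_head = pd.DataFrame({
    "%time": [43815.00000, 43815.00347],
    "EC_slab1": [np.nan, np.nan],
    "WC_slab1": [np.nan, np.nan],
    "t_slab1": [np.nan, np.nan],
})

share = missing_share(grodan_head)
print(share)
```

A share of 1.0 for a column means it carries no usable measurements.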
------------------------------------------------------------------------------------
df8:        "Crop Parameters" dataset:
   - Columns: %Time, Stem_elong, Stem_thick, Cum_trusses, stem_dens, plant_dens
   - Explanation:
     - `%Time`: Day of the crop measurement (roughly one row per week in the displayed rows).
     - `Stem_elong`: Stem growth per week (cm/week).
     - `Stem_thick`: Stem thickness (mm).
     - `Cum_trusses`: Cumulative number of new set trusses on the stem.
     - `stem_dens`: Stem density (stems/m²).
     - `plant_dens`: Plant density (plants/m²).
---------------------------------------------------------------------------------------------------------
df7:       "Greenhouse Climate" dataset:

1. **Time Information:**
   - `%time`: Timestamp or time information corresponding to each entry in the dataset.

2. **Greenhouse Climate Metrics:**
   - `AssimLight`: HPS lamps status (on-off).
   - `BlackScr`: Blackout curtain opening (%).
   - `CO2air`: CO2 concentration in the greenhouse (ppm).
   - `Cum_irr`: Cumulative number of liters of irrigation in a day (L/m² day).
   - `EC_drain_PC`: Electrical conductivity (EC) of the drain water (dS/m); it is not a drainage percentage.
   - `EnScr`: Energy curtain opening (%).
   - `HumDef`: Greenhouse humidity deficit (g/m³).
   - `PipeGrow`: Crop pipe temperature, growth circuit (°C).
   - `PipeLow`: Rail pipe temperature, lower circuit (°C).
   - `Rhair`: Greenhouse relative humidity (%).
   - `Tair`: Greenhouse air temperature (°C).
   - `Tot_PAR`: Total inside PAR (µmol/m² s).
   - `Tot_PAR_Lamps`: Total PAR from lamps.
   - `VentLee`: Leeward vents opening (%).
   - `Ventwind`: Windward vents opening (%).
   - ...and many more.

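3. **Notes:**
   - The dataset contains indoor climate measurements, actuator states, irrigation values, and setpoints (`_sp`) with their realized values (`_vip`), recorded in a greenhouse environment.
   - Some columns are constant in the displayed rows (e.g., `PipeGrow` and `t_rail_min_vip` are 0.0 throughout). The columns `assim_sp`, `assim_vip`, and `co2_dos` are hidden by the truncated display, so their values cannot be checked from this snippet.
   - Several setpoint columns (e.g., `t_rail_min_sp`, `water_sup_intervals_sp_min`) are NaN in the displayed rows.

4. **Time Range:**
   - `%time` runs from 43815.00000 to 43980.99653 in 5-minute steps, the same range and interval as the "Weather" dataset.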
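
The climate file is read with `low_memory=False`, which is commonly needed when some columns contain a mix of numbers and text. If that is the case here (an assumption, since the dtypes are not shown), the columns can be forced to numeric before analysis. The sketch below uses a small made-up frame with illustrative string values, not rows from the file; `coerce_numeric` is a hypothetical helper.

```python
import pandas as pd

def coerce_numeric(df: pd.DataFrame) -> pd.DataFrame:
    """Convert every column to a numeric dtype; unparseable entries become NaN."""
    return df.apply(pd.to_numeric, errors="coerce")

# Made-up example: one column stored as text, one already numeric.
raw = pd.DataFrame({
    "Tair": ["21.3", "NaN", " "],
    "CO2air": [509, 484, 475],
})

clean = coerce_numeric(raw)
print(clean.dtypes)
```

On the real data this would be `df7 = coerce_numeric(df7)`, after checking `df7.dtypes` to see which columns were actually read as `object`.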