[//]: # ( Greenhouse Lighting Setup Analyzer )
[//]: # ( License: MIT License )

# 💡 Greenhouse Lighting Setup Analyzer

**Version 1.0** | Created: 2025-11-04

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/outobecca/botanical-colabs/blob/main/notebooks/greenhouse/lighting_setup_analyzer.ipynb)

## 📋 Overview

**Purpose:** Analyze greenhouse lighting setups based on measured values from CSV data to optimize plant growth conditions.

### 🎯 Use Cases

- **Light Intensity Analysis:** Measure and analyze PAR (Photosynthetically Active Radiation) levels across greenhouse zones
- **Uniformity Assessment:** Evaluate light distribution uniformity to identify hot/cold spots
- **Duration Tracking:** Monitor photoperiod and Daily Light Integral (DLI)
- **Energy Efficiency:** Calculate lighting efficiency and power consumption
- **Recommendations:** Generate actionable insights for lighting optimization

### 📊 Expected CSV Format

Your CSV file should contain lighting measurements with columns such as:

- `timestamp` or `date_time`: Measurement time
- `zone` or `location`: Measurement location in greenhouse
- `light_intensity`: Light intensity (PPFD in μmol/m²/s or lux)
- `power_consumption`: Power usage (watts) [optional]
- Temperature, humidity, or other environmental factors [optional]
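As a rough illustration of the expected layout, the sketch below reads a small in-memory CSV and checks for the required columns. The sample rows and the `load_lighting_csv` helper are hypothetical and not part of the notebook; the column names follow the list above.

```python
import io

import pandas as pd

# Hypothetical sample rows; column names follow the format described above.
SAMPLE_CSV = """timestamp,zone,light_intensity,power_consumption
2025-01-01 06:00:00,Zone A,152.4,81.0
2025-01-01 06:00:00,Zone B,148.9,79.5
2025-01-01 07:00:00,Zone A,171.3,90.2
2025-01-01 07:00:00,Zone B,166.0,88.1
"""


def load_lighting_csv(source):
    """Read a lighting CSV and verify that the required columns exist."""
    frame = pd.read_csv(source)
    required = {"timestamp", "zone", "light_intensity"}
    missing = required - set(frame.columns)
    if missing:
        raise ValueError(f"Missing required columns: {sorted(missing)}")
    # Parse timestamps only after confirming the column is present.
    frame["timestamp"] = pd.to_datetime(frame["timestamp"])
    return frame


example = load_lighting_csv(io.StringIO(SAMPLE_CSV))
print(example.dtypes)
```

Passing a file path instead of the `io.StringIO` object should work the same way, since `pd.read_csv` accepts both.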

## 📚 Background

### Greenhouse Lighting Requirements

Proper lighting is critical for:

- **Photosynthesis:** Plants need 200-400 μmol/m²/s PPFD for optimal growth
- **Photoperiod Control:** Day length affects flowering and vegetative growth
- **Energy Efficiency:** Lighting can be 30-50% of greenhouse operational costs
- **Light Uniformity:** Variation >15% can cause uneven crop development

### Key Metrics

- **PPFD:** Photosynthetic Photon Flux Density (μmol/m²/s)
- **DLI:** Daily Light Integral (mol/m²/day)
- **Uniformity Ratio:** Min/Max or Coefficient of Variation
- **Efficacy:** Light output per watt (μmol/J)

This notebook analyzes measured lighting data to ensure optimal growing conditions.
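The key metrics above can be sketched as small helper functions. This is a minimal illustration, assuming a constant PPFD over the photoperiod for the DLI case; the function names are hypothetical and are not used elsewhere in the notebook.

```python
import numpy as np


def dli_from_constant_ppfd(ppfd_umol_m2_s, photoperiod_hours):
    """DLI in mol/m²/day for a constant PPFD held over the photoperiod."""
    seconds = photoperiod_hours * 3600
    # μmol/m²/s × s gives μmol/m²; divide by 1e6 to convert to mol/m².
    return ppfd_umol_m2_s * seconds / 1_000_000


def uniformity_min_max(values):
    """Uniformity ratio: lowest reading divided by highest reading."""
    values = np.asarray(values, dtype=float)
    return values.min() / values.max()


def coefficient_of_variation_pct(values):
    """CV in percent, using the sample standard deviation (ddof=1) as pandas does."""
    values = np.asarray(values, dtype=float)
    return values.std(ddof=1) / values.mean() * 100


print(dli_from_constant_ppfd(300, 16))  # 300 × 57,600 s / 1e6 = 17.28
print(round(uniformity_min_max([270, 300, 310]), 3))
print(round(coefficient_of_variation_pct([90, 100, 110]), 1))
```

For example, 300 μmol/m²/s held for 16 hours gives 17.28 mol/m²/day, which falls inside the 12-20 range this notebook uses as a target.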

## ⚙️ Step 1: Installation

Install required libraries for data analysis and visualization.

In [None]:
!pip install -q pandas numpy matplotlib seaborn plotly scipy

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.graph_objects as go
import plotly.express as px
from scipy import stats
from datetime import datetime, timedelta
import io
import warnings

warnings.filterwarnings('ignore')
sns.set_style("whitegrid")
plt.rcParams['figure.figsize'] = (12, 6)

print("✅ Analysis libraries loaded successfully")

## 📂 Step 2: Upload CSV Data

Upload your lighting measurement CSV file or use the sample data generator.

In [None]:
# Sample data generator for demonstration
def generate_sample_lighting_data():
    """Generate sample greenhouse lighting measurement data"""
    np.random.seed(42)

    # Create 7 days of hourly measurements across 6 zones
    dates = pd.date_range(start='2025-01-01', periods=168, freq='h')
    zones = ['Zone A', 'Zone B', 'Zone C', 'Zone D', 'Zone E', 'Zone F']

    data = []
    for date in dates:
        hour = date.hour
        # Simulate daylight hours (6 AM - 10 PM = 16 hours photoperiod)
        if 6 <= hour < 22:
            for zone in zones:
                # Base light intensity with random variation
                base_intensity = 300  # Target PPFD
                # ±15% variation, redrawn for every reading (zones do not differ systematically)
                zone_factor = np.random.uniform(0.85, 1.15)
                time_factor = 1.0 - abs(hour - 14) / 16  # Peak at 2 PM

                intensity = base_intensity * zone_factor * time_factor * np.random.uniform(0.95, 1.05)
                power = intensity * 0.5 + np.random.uniform(-5, 5)  # Simulated power consumption

                data.append({
                    'timestamp': date,
                    'zone': zone,
                    'light_intensity_ppfd': round(intensity, 2),
                    'power_watts': round(power, 2),
                    'temperature_c': round(22 + np.random.uniform(-2, 2), 1),
                    'humidity_percent': round(65 + np.random.uniform(-5, 5), 1)
                })

    df = pd.DataFrame(data)
    return df

# Try to upload CSV or use sample data
print("📊 DATA SOURCE SELECTION")
print("-" * 50)
print("Option 1: Upload your CSV file using the file upload")
print("Option 2: Use sample data for demonstration")
print()

use_sample = input("Use sample data? (yes/no): ").strip().lower()

if use_sample in ('yes', 'y'):
    df = generate_sample_lighting_data()
    print("✅ Generated sample data with", len(df), "measurements")
    print("   - Time range:", df['timestamp'].min(), "to", df['timestamp'].max())
    print("   - Zones:", df['zone'].nunique())
else:
    # File upload for Colab/Jupyter
    try:
        from google.colab import files
        uploaded = files.upload()
        filename = list(uploaded.keys())[0]
        df = pd.read_csv(io.BytesIO(uploaded[filename]))
        print(f"✅ Loaded {filename} with {len(df)} rows")
    except Exception as exc:
        # Covers running outside Colab, a cancelled upload, or an unreadable file
        print(f"⚠️  File upload not available ({exc}). Please load CSV manually:")
        print("   df = pd.read_csv('your_file.csv')")
        df = generate_sample_lighting_data()
        print("   Using sample data instead")

# Display first rows
print("\n📋 Data Preview:")
display(df.head(10))
print("\n📊 Data Info:")
df.info()  # info() prints directly and returns None

## ✅ Step 3: Data Validation and Preprocessing

Validate data quality and prepare for analysis.

In [None]:
# Data validation and preprocessing
print("🔍 DATA VALIDATION")
print("=" * 60)

# Identify timestamp column
timestamp_cols = [col for col in df.columns if 'time' in col.lower() or 'date' in col.lower()]
if timestamp_cols:
    timestamp_col = timestamp_cols[0]
    df[timestamp_col] = pd.to_datetime(df[timestamp_col])
    print(f"✅ Timestamp column: {timestamp_col}")
else:
    print("⚠️  No timestamp column found")
    timestamp_col = None

# Identify zone/location column
zone_cols = [col for col in df.columns if 'zone' in col.lower() or 'location' in col.lower()]
zone_col = zone_cols[0] if zone_cols else None
if zone_col:
    print(f"✅ Zone/Location column: {zone_col}")
    print(f"   Zones found: {df[zone_col].nunique()} - {list(df[zone_col].unique())}")

# Identify intensity column
intensity_cols = [col for col in df.columns
                  if 'light' in col.lower() or 'intensity' in col.lower()
                  or 'ppfd' in col.lower() or 'par' in col.lower()]
intensity_col = intensity_cols[0] if intensity_cols else None
if intensity_col:
    print(f"✅ Light intensity column: {intensity_col}")
    print(f"   Range: {df[intensity_col].min():.2f} - {df[intensity_col].max():.2f}")

# Identify power column
power_cols = [col for col in df.columns if 'power' in col.lower() or 'watt' in col.lower()]
power_col = power_cols[0] if power_cols else None
if power_col:
    print(f"✅ Power column: {power_col}")

# Check for missing values
print("\n📊 Missing Values:")
missing = df.isnull().sum()
if missing.sum() > 0:
    print(missing[missing > 0])
else:
    print("   None - data is complete!")

# Check for outliers in intensity (1.5 × IQR rule)
if intensity_col:
    Q1 = df[intensity_col].quantile(0.25)
    Q3 = df[intensity_col].quantile(0.75)
    IQR = Q3 - Q1
    outliers = df[(df[intensity_col] < Q1 - 1.5 * IQR) | (df[intensity_col] > Q3 + 1.5 * IQR)]
    print(f"\n⚠️  Outliers detected: {len(outliers)} measurements ({len(outliers)/len(df)*100:.1f}%)")

print("\n✅ Data validation complete")
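The outlier check in the validation cell applies the 1.5 × IQR rule to all readings pooled together. A minimal standalone sketch of that rule is shown below; `iqr_outlier_mask` and the sample readings are hypothetical and only illustrate the calculation.

```python
import pandas as pd


def iqr_outlier_mask(series, k=1.5):
    """Return a boolean mask marking values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)


# Hypothetical PPFD readings with one obviously suspicious value.
readings = pd.Series([280, 295, 300, 305, 310, 900])
mask = iqr_outlier_mask(readings)
print(readings[mask].tolist())  # [900]
```

Because PPFD follows a daily cycle, pooling all hours can flag legitimate early-morning or midday readings. Applying the same mask per zone or per hour of day (for instance through `groupby`) may give a more meaningful result.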

## 📊 Step 4: Lighting Analysis

Analyze light intensity, uniformity, duration, and efficiency.

In [None]:
# Comprehensive lighting analysis
print("💡 LIGHTING ANALYSIS")
print("=" * 60)

# 1. Overall Statistics
print("\n1️⃣ OVERALL LIGHT INTENSITY STATISTICS")
print("-" * 60)
if intensity_col:
    stats_dict = {
        'Mean': df[intensity_col].mean(),
        'Median': df[intensity_col].median(),
        'Std Dev': df[intensity_col].std(),
        'Min': df[intensity_col].min(),
        'Max': df[intensity_col].max(),
        'CV (%)': (df[intensity_col].std() / df[intensity_col].mean() * 100)
    }
    for key, value in stats_dict.items():
        print(f"{key:20s}: {value:8.2f}")

# 2. Zone Analysis
if zone_col and intensity_col:
    print("\n2️⃣ ZONE-WISE ANALYSIS")
    print("-" * 60)
    zone_stats = df.groupby(zone_col)[intensity_col].agg(['mean', 'std', 'min', 'max', 'count'])
    zone_stats['cv_%'] = (zone_stats['std'] / zone_stats['mean'] * 100)
    print(zone_stats.round(2))

    # Uniformity ratio
    zone_means = df.groupby(zone_col)[intensity_col].mean()
    uniformity_ratio = zone_means.min() / zone_means.max()
    print(f"\n   Uniformity Ratio (min/max): {uniformity_ratio:.3f}")
    if uniformity_ratio >= 0.85:
        print("   ✅ Excellent uniformity (≥0.85)")
    elif uniformity_ratio >= 0.70:
        print("   ⚠️  Acceptable uniformity (0.70-0.85)")
    else:
        print("   ❌ Poor uniformity (<0.70) - adjustment needed")

# 3. Daily Light Integral (DLI)
if timestamp_col and intensity_col:
    print("\n3️⃣ DAILY LIGHT INTEGRAL (DLI)")
    print("-" * 60)
    # DLI = PPFD (μmol/m²/s) × photoperiod (hours) × 3600 / 1,000,000
    df['date'] = df[timestamp_col].dt.date
    df['hour'] = df[timestamp_col].dt.hour

    # Average across zones within each hour first, so the photoperiod counts
    # clock hours rather than rows (one row per zone would inflate it).
    light_threshold = 50  # μmol/m²/s
    hourly_ppfd = df.groupby(['date', 'hour'])[intensity_col].mean()
    lit_hours = hourly_ppfd[hourly_ppfd > light_threshold]
    daily_stats = pd.DataFrame({
        'avg_ppfd': lit_hours.groupby(level='date').mean(),
        'photoperiod_hours': lit_hours.groupby(level='date').count()
    })

    # Calculate DLI (mean PPFD over lit hours × number of lit hours)
    daily_stats['dli_mol_m2_day'] = (daily_stats['avg_ppfd'] * daily_stats['photoperiod_hours'] * 3600 / 1_000_000)

    print(daily_stats.round(2))
    print(f"\n   Average DLI: {daily_stats['dli_mol_m2_day'].mean():.2f} mol/m²/day")
    print(f"   Average Photoperiod: {daily_stats['photoperiod_hours'].mean():.1f} hours")

    avg_dli = daily_stats['dli_mol_m2_day'].mean()
    if avg_dli >= 12:
        print("   ✅ Sufficient DLI for most crops (≥12 mol/m²/day)")
    elif avg_dli >= 6:
        print("   ⚠️  Low DLI (6-12 mol/m²/day) - may limit growth")
    else:
        print("   ❌ Insufficient DLI (<6 mol/m²/day) - increase needed")

# 4. Energy Efficiency
if power_col and intensity_col:
    print("\n4️⃣ ENERGY EFFICIENCY")
    print("-" * 60)
    # PPFD / Power is a proxy for efficacy: it equals μmol/J only when the
    # power values are expressed per m² of lit area.
    df['efficacy_umol_j'] = df[intensity_col] / df[power_col]
    avg_efficacy = df['efficacy_umol_j'].mean()
    # Mean power per reading, scaled by the number of zones when zones are known
    mean_power = df[power_col].mean()
    total_power = mean_power * df[zone_col].nunique() if zone_col else mean_power

    print(f"   Average Efficacy: {avg_efficacy:.2f} μmol/J")
    print(f"   Average Power Consumption: {total_power:.2f} W")

    if avg_efficacy >= 2.5:
        print("   ✅ Excellent efficiency (LED quality)")
    elif avg_efficacy >= 1.5:
        print("   ⚠️  Good efficiency (HPS/modern HID)")
    else:
        print("   ❌ Low efficiency - consider upgrade")

print("\n✅ Analysis complete")
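The DLI step in the analysis cell multiplies an average PPFD by a photoperiod, which works when readings are evenly spaced. For irregular sampling intervals, one alternative is to integrate PPFD over time with the trapezoidal rule. The sketch below illustrates that approach; `daily_light_integral` is a hypothetical helper, and it expects a single PPFD series (for example, the mean across zones at each timestamp).

```python
import numpy as np
import pandas as pd


def daily_light_integral(timestamps, ppfd):
    """Integrate PPFD (μmol/m²/s) over time per calendar day; returns mol/m²/day."""
    series = pd.Series(
        np.asarray(ppfd, dtype=float), index=pd.DatetimeIndex(timestamps)
    ).sort_index()
    results = {}
    for day, group in series.groupby(series.index.date):
        seconds = (group.index - group.index[0]).total_seconds().to_numpy()
        values = group.to_numpy()
        # Trapezoidal rule: mean of adjacent readings × elapsed seconds between them.
        umol_per_m2 = np.sum((values[1:] + values[:-1]) / 2 * np.diff(seconds))
        results[day] = umol_per_m2 / 1_000_000
    return pd.Series(results, name="dli_mol_m2_day")


# Constant 300 μmol/m²/s from 06:00 to 22:00 (16 hours, 17 hourly readings).
ts = pd.date_range("2025-01-01 06:00", "2025-01-01 22:00", freq=pd.Timedelta(hours=1))
dli = daily_light_integral(ts, [300] * len(ts))
print(round(dli.iloc[0], 2))  # 17.28
```

Note that this sketch only integrates between the first and last reading of each day, so light recorded across midnight is split at the day boundary without interpolation.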

## 📈 Step 5: Visualizations

Create comprehensive visualizations of lighting data.

In [None]:
# Create visualizations
print("📊 GENERATING VISUALIZATIONS")
print("=" * 60)

fig, axes = plt.subplots(2, 2, figsize=(16, 12))

# 1. Light Intensity Distribution
if intensity_col:
    axes[0, 0].hist(df[intensity_col], bins=30, edgecolor='black', alpha=0.7, color='gold')
    axes[0, 0].axvline(df[intensity_col].mean(), color='red', linestyle='--', linewidth=2, label=f'Mean: {df[intensity_col].mean():.1f}')
    axes[0, 0].axvline(df[intensity_col].median(), color='blue', linestyle='--', linewidth=2, label=f'Median: {df[intensity_col].median():.1f}')
    axes[0, 0].set_xlabel('Light Intensity (μmol/m²/s)', fontsize=12)
    axes[0, 0].set_ylabel('Frequency', fontsize=12)
    axes[0, 0].set_title('Light Intensity Distribution', fontsize=14, fontweight='bold')
    axes[0, 0].legend()
    axes[0, 0].grid(True, alpha=0.3)

# 2. Zone Comparison
if zone_col and intensity_col:
    zone_names = df[zone_col].unique()
    zone_data = [df[df[zone_col] == zone][intensity_col].values for zone in zone_names]
    bp = axes[0, 1].boxplot(zone_data, patch_artist=True)
    # Set tick labels separately; the boxplot `labels` keyword is deprecated in recent matplotlib
    axes[0, 1].set_xticklabels(zone_names)
    for patch in bp['boxes']:
        patch.set_facecolor('lightblue')
    axes[0, 1].set_xlabel('Zone', fontsize=12)
    axes[0, 1].set_ylabel('Light Intensity (μmol/m²/s)', fontsize=12)
    axes[0, 1].set_title('Light Intensity by Zone', fontsize=14, fontweight='bold')
    axes[0, 1].grid(True, alpha=0.3, axis='y')
    axes[0, 1].tick_params(axis='x', rotation=45)

# 3. Time Series
if timestamp_col and intensity_col:
    # Average by hour for clarity ('h' matches the alias used by the sample generator)
    hourly_avg = df.groupby(df[timestamp_col].dt.floor('h'))[intensity_col].mean()
    axes[1, 0].plot(hourly_avg.index, hourly_avg.values, linewidth=2, color='orange', marker='o', markersize=3)
    axes[1, 0].set_xlabel('Time', fontsize=12)
    axes[1, 0].set_ylabel('Avg Light Intensity (μmol/m²/s)', fontsize=12)
    axes[1, 0].set_title('Light Intensity Over Time', fontsize=14, fontweight='bold')
    axes[1, 0].grid(True, alpha=0.3)
    axes[1, 0].tick_params(axis='x', rotation=45)

# 4. Power vs Intensity
if power_col and intensity_col:
    axes[1, 1].scatter(df[power_col], df[intensity_col], alpha=0.5, s=20, color='green')
    # Add trend line (first-degree least-squares fit)
    z = np.polyfit(df[power_col], df[intensity_col], 1)
    p = np.poly1d(z)
    power_sorted = df[power_col].sort_values()
    axes[1, 1].plot(power_sorted, p(power_sorted), "r--", linewidth=2, label=f'Trend: y={z[0]:.2f}x+{z[1]:.2f}')
    axes[1, 1].set_xlabel('Power (W)', fontsize=12)
    axes[1, 1].set_ylabel('Light Intensity (μmol/m²/s)', fontsize=12)
    axes[1, 1].set_title('Power vs Light Intensity', fontsize=14, fontweight='bold')
    axes[1, 1].legend()
    axes[1, 1].grid(True, alpha=0.3)

plt.tight_layout()
plt.savefig('lighting_analysis.png', dpi=300, bbox_inches='tight')
print("✅ Saved: lighting_analysis.png")
plt.show()

# Interactive heatmap if zone and time data available
if zone_col and timestamp_col and intensity_col:
    print("\n📊 Creating interactive heatmap...")
    # Pivot data for heatmap
    df['hour'] = df[timestamp_col].dt.hour
    df['date'] = df[timestamp_col].dt.date

    # Average intensity by zone and hour
    pivot_data = df.pivot_table(values=intensity_col, index=zone_col, columns='hour', aggfunc='mean')

    fig_heat = go.Figure(data=go.Heatmap(
        z=pivot_data.values,
        x=pivot_data.columns,
        y=pivot_data.index,
        colorscale='YlOrRd',
        colorbar=dict(title='PPFD (μmol/m²/s)')
    ))

    fig_heat.update_layout(
        title='Light Intensity Heatmap: Zone vs Hour of Day',
        xaxis_title='Hour of Day',
        yaxis_title='Zone',
        width=900,
        height=500
    )

    fig_heat.show()
    print("✅ Interactive heatmap displayed")

print("\n✅ All visualizations complete")

## 💡 Step 6: Recommendations

Generate actionable recommendations based on analysis.

In [None]:
# Generate recommendations
print("💡 LIGHTING SETUP RECOMMENDATIONS")
print("=" * 60)

recommendations = []

# Check intensity levels
if intensity_col:
    avg_intensity = df[intensity_col].mean()

    if avg_intensity < 200:
        recommendations.append({
            'priority': 'HIGH',
            'issue': 'Low average light intensity',
            'detail': f'Average PPFD is {avg_intensity:.1f} μmol/m²/s (target: 250-400)',
            'action': 'Increase light output or reduce fixture height'
        })
    elif avg_intensity > 600:
        recommendations.append({
            'priority': 'MEDIUM',
            'issue': 'High light intensity',
            'detail': f'Average PPFD is {avg_intensity:.1f} μmol/m²/s (may cause stress)',
            'action': 'Consider dimming or raising fixtures to prevent photoinhibition'
        })
    else:
        recommendations.append({
            'priority': 'INFO',
            'issue': 'Appropriate light intensity',
            'detail': f'Average PPFD is {avg_intensity:.1f} μmol/m²/s',
            'action': 'Maintain current settings'
        })

# Check uniformity
if zone_col and intensity_col:
    zone_means = df.groupby(zone_col)[intensity_col].mean()
    uniformity_ratio = zone_means.min() / zone_means.max()

    if uniformity_ratio < 0.70:
        recommendations.append({
            'priority': 'HIGH',
            'issue': 'Poor light uniformity',
            'detail': f'Uniformity ratio is {uniformity_ratio:.3f} (target: >0.85)',
            'action': 'Reposition fixtures, add supplemental lights, or adjust fixture spacing'
        })
    elif uniformity_ratio < 0.85:
        recommendations.append({
            'priority': 'MEDIUM',
            'issue': 'Suboptimal uniformity',
            'detail': f'Uniformity ratio is {uniformity_ratio:.3f}',
            'action': 'Minor adjustments to fixture positions recommended'
        })

# Check DLI
if timestamp_col and intensity_col:
    # Recompute date and hour here so this cell does not depend on Step 4 having run
    df['date'] = df[timestamp_col].dt.date
    df['hour'] = df[timestamp_col].dt.hour
    light_threshold = 50
    # Average across zones per hour so the photoperiod counts clock hours, not rows
    hourly_ppfd = df.groupby(['date', 'hour'])[intensity_col].mean()
    lit_hours = hourly_ppfd[hourly_ppfd > light_threshold]
    daily_stats = pd.DataFrame({
        'avg_ppfd': lit_hours.groupby(level='date').mean(),
        'photoperiod_hours': lit_hours.groupby(level='date').count()
    })
    daily_stats['dli_mol_m2_day'] = (daily_stats['avg_ppfd'] * daily_stats['photoperiod_hours'] * 3600 / 1_000_000)
    avg_dli = daily_stats['dli_mol_m2_day'].mean()

    if avg_dli < 6:
        recommendations.append({
            'priority': 'HIGH',
            'issue': 'Insufficient Daily Light Integral',
            'detail': f'DLI is {avg_dli:.1f} mol/m²/day (target: 12-20 for most crops)',
            'action': 'Extend photoperiod or increase light intensity'
        })
    elif avg_dli < 12:
        recommendations.append({
            'priority': 'MEDIUM',
            'issue': 'Low Daily Light Integral',
            'detail': f'DLI is {avg_dli:.1f} mol/m²/day',
            'action': 'Consider increasing photoperiod or intensity for better growth'
        })

# Check efficiency
if power_col and intensity_col:
    df['efficacy_umol_j'] = df[intensity_col] / df[power_col]
    avg_efficacy = df['efficacy_umol_j'].mean()

    if avg_efficacy < 1.5:
        recommendations.append({
            'priority': 'MEDIUM',
            'issue': 'Low lighting efficiency',
            'detail': f'Efficacy is {avg_efficacy:.2f} μmol/J (modern LEDs: 2.5-3.0)',
            'action': 'Consider upgrading to high-efficiency LED fixtures'
        })

# Display recommendations
for i, rec in enumerate(recommendations, 1):
    print(f"\n{i}. [{rec['priority']}] {rec['issue']}")
    print(f"   📊 {rec['detail']}")
    print(f"   💡 {rec['action']}")

if not recommendations:
    print("\n✅ No issues detected - lighting setup is optimal!")

print("\n" + "=" * 60)
print("✅ Recommendations generated")
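The efficiency check above divides PPFD by power, which yields μmol/J only if the power figure is per square metre of lit area. When total fixture power and the illuminated area are known, a rough area-aware estimate can be sketched as follows. The helper name and the 10 m² / 1200 W figures are hypothetical, and the calculation assumes every photon from the fixtures lands on the measured area, so it says nothing about losses to walls or aisles.

```python
def approx_efficacy_umol_per_j(mean_ppfd_umol_m2_s, lit_area_m2, total_power_w):
    """Approximate system efficacy: photon flux reaching the canopy per watt drawn."""
    if total_power_w <= 0:
        raise ValueError("total_power_w must be positive")
    if lit_area_m2 <= 0:
        raise ValueError("lit_area_m2 must be positive")
    # PPFD × area gives photon flux in μmol/s; dividing by W (J/s) gives μmol/J.
    photon_flux_umol_s = mean_ppfd_umol_m2_s * lit_area_m2
    return photon_flux_umol_s / total_power_w


# Hypothetical bay: 300 μmol/m²/s averaged over 10 m², fixtures drawing 1200 W.
print(approx_efficacy_umol_per_j(300, 10, 1200))  # 2.5
```

The result can then be compared against the same 1.5 and 2.5 μmol/J thresholds the notebook uses.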

## 💾 Step 7: Export Results

Save analysis results and visualizations.

In [None]:
# Export results
print("💾 EXPORTING RESULTS")
print("=" * 60)

timestamp_str = datetime.now().strftime("%Y%m%d_%H%M%S")

# 1. Save summary statistics to CSV
if intensity_col:
    summary_stats = pd.DataFrame({
        'Metric': ['Mean PPFD', 'Median PPFD', 'Std Dev', 'Min', 'Max', 'CV (%)'],
        'Value': [
            df[intensity_col].mean(),
            df[intensity_col].median(),
            df[intensity_col].std(),
            df[intensity_col].min(),
            df[intensity_col].max(),
            (df[intensity_col].std() / df[intensity_col].mean() * 100)
        ]
    })

    filename = f'lighting_summary_{timestamp_str}.csv'
    summary_stats.to_csv(filename, index=False)
    print(f"✅ Saved summary statistics: {filename}")

# 2. Save zone statistics if available
if zone_col and intensity_col:
    zone_stats = df.groupby(zone_col)[intensity_col].agg(['mean', 'std', 'min', 'max', 'count'])
    zone_stats['cv_%'] = (zone_stats['std'] / zone_stats['mean'] * 100)

    filename = f'zone_analysis_{timestamp_str}.csv'
    zone_stats.to_csv(filename)
    print(f"✅ Saved zone analysis: {filename}")

# 3. Already saved visualization
print("✅ Visualization saved: lighting_analysis.png")

print("\n" + "=" * 60)
print("📁 All results exported successfully")

print("\n📚 CITATION")
print("-" * 60)
print("If using this analysis in research or reports:")
print("- Tool: Greenhouse Lighting Setup Analyzer")
print("- Repository: https://github.com/outobecca/botanical-colabs")
print("- Date:", datetime.now().strftime("%Y-%m-%d"))

## 📚 Summary and Next Steps

### What This Notebook Does

✅ Loads and validates CSV lighting measurement data  
✅ Analyzes light intensity, uniformity, and distribution  
✅ Calculates Daily Light Integral (DLI)  
✅ Evaluates energy efficiency  
✅ Generates visualizations and recommendations  
✅ Exports results for reporting

### Interpretation Guide

- **PPFD (μmol/m²/s):** Target 250-400 for most greenhouse crops
- **DLI (mol/m²/day):** Target 12-20 for optimal growth
- **Uniformity Ratio:** >0.85 is excellent, >0.70 is acceptable
- **Efficacy (μmol/J):** >2.5 for high-efficiency LEDs

### Next Steps

1. Review recommendations and adjust lighting setup
2. Re-measure after adjustments to verify improvements
3. Monitor DLI across seasons and adjust photoperiod
4. Consider automating measurements with sensors
5. Track energy costs and calculate ROI for upgrades

### Resources

- [Greenhouse Lighting Guide](https://www.extension.iastate.edu/greenhouse/production/greenhouse-lighting)
- [DLI Requirements by Crop](https://www.canr.msu.edu/uploads/resources/pdfs/dli-requirements-by-crop.pdf)
- [LED Lighting Best Practices](https://www.pnas.org/doi/10.1073/pnas.2110757118)

---

**License:** MIT License  
**Repository:** https://github.com/outobecca/botanical-colabs  
**Version:** 1.0 | Created: 2025-11-04