# Ayurvedic Medicine Sensor Analysis Pipeline

This notebook demonstrates the complete machine learning pipeline for analyzing NIR sensor data from Ayurvedic medicines. We'll cover:

1. Data loading and validation
2. Exploratory data analysis
3. Feature engineering
4. Model development and training
5. Evaluation and visualization
6. Production deployment

## Setup Requirements
- Python 3.8+
- Required packages: pandas, numpy, scikit-learn, tensorflow, plotly, nbformat (>=4.2.0, needed for Plotly figures to render in the notebook), joblib, fastapi
- Git repository for version control

In [8]:
# Import required libraries
import numpy as np
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots
from sklearn.preprocessing import StandardScaler, MinMaxScaler
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier
from sklearn.svm import SVR, SVC
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.optimizers import Adam
import joblib
import json
import os

# Import custom modules
import sys
sys.path.append('..')
from src.data.preprocessing import DataPreprocessor
from src.features.engineering import FeatureEngineer
from src.models.ensemble import EnsembleModel
from src.visualization.plots import Visualizer

# Set random seed for reproducibility
np.random.seed(42)
tf.random.set_seed(42)

# Data Loading and Validation

Let's create a sample dataset and demonstrate the data loading and validation pipeline. Our dataset should have the following structure:
- 6 NIR wavelength sensors (R,S,T,U,V,W)
- Temperature readings (not generated by the synthetic sample cell below)
- Dilution percentages (100%, 75%, 50%, 25%, 10%)
- Medicine names (4 medicines in the synthetic sample below)
- Effectiveness scores
- Reading IDs
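
The structure listed above can be checked before any modeling. The following is a minimal sketch of such a check, not the notebook's `DataPreprocessor` logic (which is not shown here). The function name `validate_readings` and the constants are hypothetical, and temperature is left out because the synthetic sample does not contain it.

```python
import pandas as pd

# Assumed schema, taken from the column names used in this notebook
SENSOR_COLUMNS = ['R', 'S', 'T', 'U', 'V', 'W']
ALLOWED_DILUTIONS = {100, 75, 50, 25, 10}

def validate_readings(df):
    """Return a list of problems found; an empty list means the frame passed."""
    required = SENSOR_COLUMNS + ['Dilution_Percent', 'Medicine_Name',
                                 'Effectiveness_Score', 'Reading_ID']
    missing = [col for col in required if col not in df.columns]
    if missing:
        # The remaining checks need these columns, so stop early
        return [f"missing columns: {missing}"]

    problems = []
    if df[required].isna().any().any():
        problems.append("null values present")
    if (df[SENSOR_COLUMNS] < 0).any().any():
        problems.append("negative sensor readings")
    unexpected = set(df['Dilution_Percent'].tolist()) - ALLOWED_DILUTIONS
    if unexpected:
        problems.append(f"unexpected dilution levels: {sorted(unexpected)}")
    if df['Reading_ID'].duplicated().any():
        problems.append("duplicate Reading_ID values")
    return problems

# Two hand-written rows; the second has a dilution level outside the allowed set
sample = pd.DataFrame({
    'R': [0.69, 0.52], 'S': [0.76, 0.57], 'T': [1.07, 0.80],
    'U': [1.23, 0.92], 'V': [1.05, 0.79], 'W': [0.88, 0.66],
    'Dilution_Percent': [100, 60],
    'Medicine_Name': ['Ashwagandha', 'Ashwagandha'],
    'Effectiveness_Score': [1.0, 0.6],
    'Reading_ID': [1, 2],
})
print(validate_readings(sample))
```

Returning a list of messages instead of raising on the first failure makes it easier to report every issue in a file at once.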

In [3]:
# Generate sample data with taste profiles
np.random.seed(42)

# Define parameters
n_samples_per_dilution = 25
dilution_levels = [100, 75, 50, 25, 10]
medicines = {
    'Ashwagandha': {
        'primary_taste': 'bitter',  # Tikta
        'secondary_taste': 'astringent'  # Kashaya
    },
    'Turmeric': {
        'primary_taste': 'bitter',  # Tikta
        'secondary_taste': 'pungent'  # Katu
    },
    'Tulsi': {
        'primary_taste': 'pungent',  # Katu
        'secondary_taste': 'bitter'  # Tikta
    },
    'Neem': {
        'primary_taste': 'bitter',  # Tikta
        'secondary_taste': 'astringent'  # Kashaya
    }
}

# Baseline sensor pattern for each taste profile
taste_profiles = {
    'sweet': {'R': 1.2, 'S': 1.1, 'T': 0.9, 'U': 0.8, 'V': 0.7, 'W': 0.6},  # Madhura
    'sour': {'R': 0.8, 'S': 1.0, 'T': 1.2, 'U': 1.1, 'V': 0.9, 'W': 0.7},   # Amla
    'salty': {'R': 0.9, 'S': 1.1, 'T': 1.0, 'U': 0.9, 'V': 0.8, 'W': 0.9},  # Lavana
    'bitter': {'R': 0.7, 'S': 0.8, 'T': 1.1, 'U': 1.2, 'V': 1.0, 'W': 0.8}, # Tikta
    'pungent': {'R': 1.0, 'S': 0.9, 'T': 0.8, 'U': 1.0, 'V': 1.1, 'W': 1.0},# Katu
    'astringent': {'R': 0.6, 'S': 0.7, 'T': 0.9, 'U': 1.0, 'V': 1.2, 'W': 1.1} # Kashaya
}

# Generate synthetic data
data = []
reading_id = 1

for medicine, tastes in medicines.items():
    primary_profile = taste_profiles[tastes['primary_taste']]
    secondary_profile = taste_profiles[tastes['secondary_taste']]
    
    for dilution in dilution_levels:
        for _ in range(n_samples_per_dilution):
            # Combine primary and secondary taste profiles
            base_readings = {}
            for channel in ['R', 'S', 'T', 'U', 'V', 'W']:
                base_value = (0.7 * primary_profile[channel] + 
                            0.3 * secondary_profile[channel])
                # Scale by dilution and add noise
                base_readings[channel] = (base_value * dilution/100 * 
                                        np.random.normal(1, 0.05))
            
            # Effectiveness score: proportional to dilution, with 10% multiplicative noise
            effectiveness = (dilution/100) * np.random.normal(1, 0.1)
            
            # Clip readings to [0.1, 5.0]. Note that the 0.1 floor overrides most
            # 10%-dilution readings, whose expected values are roughly 0.07 to 0.11.
            for channel in base_readings:
                base_readings[channel] = np.clip(base_readings[channel], 0.1, 5.0)
            
            data.append({
                'R': base_readings['R'],
                'S': base_readings['S'],
                'T': base_readings['T'],
                'U': base_readings['U'],
                'V': base_readings['V'],
                'W': base_readings['W'],
                'Dilution_Percent': dilution,
                'Medicine_Name': medicine,
                'Primary_Taste': tastes['primary_taste'],
                'Secondary_Taste': tastes['secondary_taste'],
                'Effectiveness_Score': effectiveness,
                'Reading_ID': reading_id
            })
            reading_id += 1

# Create DataFrame
df = pd.DataFrame(data)

# Display the first 20 rows (all columns)
print("\nSynthetic Data Sample:")
pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)
print(df.head(20).to_string())

# Save to CSV
os.makedirs('../data', exist_ok=True)
df.to_csv('../data/sensor_readings_with_tastes.csv', index=False)
print("\nData saved to '../data/sensor_readings_with_tastes.csv'")


Synthetic Data Sample:
           R         S         T         U         V         W  Dilution_Percent Medicine_Name Primary_Taste Secondary_Taste  Effectiveness_Score  Reading_ID
0   0.686640  0.764677  1.073680  1.226813  1.047590  0.879581               100   Ashwagandha        bitter      astringent             1.157921           1
1   0.695709  0.751925  1.068213  1.113585  1.035316  0.900767               100   Ashwagandha        bitter      astringent             0.808672           2
2   0.612215  0.748352  0.987333  1.157912  1.011875  0.827152               100   Ashwagandha        bitter      astringent             1.146565           3
3   0.662436  0.772600  0.965913  1.108970  1.065879  0.838781               100   Ashwagandha        bitter      astringent             1.037570           4
4   0.649879  0.758770  1.008711  1.245580  1.059285  0.842932               100   Ashwagandha        bitter      astringent             1.082254           5
5   0.629102  0.778041  0.93

## Synthetic Data Generation with Taste Profiles

Let's create synthetic data that includes:
1. NIR sensor readings (R,S,T,U,V,W)
2. Dilution levels
3. Medicine names
4. Effectiveness scores
5. Reading IDs
6. Taste profiles (Madhura, Amla, Lavana, Tikta, Katu, Kashaya)

In [2]:
# Import required libraries
import numpy as np
import pandas as pd
import os

# Data Preprocessing and Feature Engineering

Now let's preprocess the data and create engineered features using our custom modules. We'll:
1. Initialize the data preprocessor
2. Load and validate the data
3. Apply temperature compensation
4. Normalize features
5. Engineer additional features using wavelength ratios and spectral derivatives
6. Apply PCA for dimensionality reduction
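
The custom `DataPreprocessor` and `FeatureEngineer` modules are not shown in this notebook, so the sketch below is only a stand-in for the steps listed above, using NumPy and scikit-learn. The function names, the linear temperature model, and the coefficient `coeff=0.002` are assumptions for illustration. It does not try to reproduce the 39 engineered columns reported in the output further down.

```python
from itertools import combinations

import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

def compensate_temperature(readings, temperature, ref_temp=25.0, coeff=0.002):
    """Linear correction toward a reference temperature (coeff is a made-up value)."""
    return readings / (1.0 + coeff * (temperature[:, None] - ref_temp))

def wavelength_ratios(readings, eps=1e-9):
    """All pairwise channel ratios: 15 columns for 6 channels."""
    pairs = combinations(range(readings.shape[1]), 2)
    return np.column_stack([readings[:, i] / (readings[:, j] + eps) for i, j in pairs])

def spectral_derivative(readings):
    """First difference between adjacent channels: 5 columns for 6 channels."""
    return np.diff(readings, axis=1)

# Random stand-in data: 20 readings of the six channels plus a temperature each
rng = np.random.default_rng(42)
readings = rng.uniform(0.1, 1.2, size=(20, 6))
temperature = rng.normal(25.0, 2.0, size=20)

compensated = compensate_temperature(readings, temperature)
scaled = StandardScaler().fit_transform(compensated)

# Ratios and derivatives are computed on the positive, unscaled readings
combined = np.hstack([scaled,
                      wavelength_ratios(compensated),
                      spectral_derivative(compensated)])

# Standardize again so no single feature family dominates the components
pca_features = PCA(n_components=3).fit_transform(StandardScaler().fit_transform(combined))
print(combined.shape, pca_features.shape)
```

Ratios are taken before standardization because standardized values are centered on zero, which would make the denominators unstable.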

In [10]:
# Initialize processors
preprocessor = DataPreprocessor()
feature_engineer = FeatureEngineer()

# Load and preprocess data
df = preprocessor.load_data('../data/sensor_readings.csv')
df_comp = preprocessor.temperature_compensation(df)
df_norm = preprocessor.normalize_features(df_comp)

# Prepare feature matrix
X = np.hstack([
    df_norm[preprocessor.sensor_columns].values,
    df_norm[[preprocessor.temp_column]].values
])

# Engineer features
features = feature_engineer.engineer_features(
    df_norm[preprocessor.sensor_columns].values,
    df_norm[preprocessor.temp_column].values
)

print("Original features shape:", X.shape)
print("Engineered features shape:", features['combined'].shape)
print("PCA features shape:", features['pca'].shape)

Original features shape: (500, 7)
Engineered features shape: (500, 39)
PCA features shape: (500, 3)


# Data Visualization

Let's create visualizations to understand our data better:
1. Sensor reading distributions
2. Temperature vs sensor response
3. Dilution level effects
4. Medicine type patterns
5. Feature correlations

In [None]:
# Visualization of sensor readings
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots

# Create subplots for all sensor readings
fig = make_subplots(rows=2, cols=3, 
                    subplot_titles=('R Channel', 'S Channel', 'T Channel',
                                  'U Channel', 'V Channel', 'W Channel'))

channels = ['R', 'S', 'T', 'U', 'V', 'W']
row_col = [(1,1), (1,2), (1,3), (2,1), (2,2), (2,3)]

for (channel, (row, col)) in zip(channels, row_col):
    for medicine in medicines:
        med_data = df[df['Medicine_Name'] == medicine]
        fig.add_trace(
            go.Scatter(x=med_data['Dilution_Percent'], 
                      y=med_data[channel],
                      name=f'{medicine} - {channel}',
                      mode='markers',
                      marker=dict(size=8),
                      showlegend=(row == 1 and col == 1)),
            row=row, col=col
        )

fig.update_layout(height=800, width=1200, title_text="Sensor Readings vs Dilution by Medicine Type")
fig.update_xaxes(title_text="Dilution (%)")
fig.update_yaxes(title_text="Sensor Reading")
fig.show()

# Create effectiveness heatmap
df_pivot = df.pivot_table(values='Effectiveness_Score', 
                         index='Medicine_Name', 
                         columns='Dilution_Percent',
                         aggfunc='mean')

fig_heatmap = go.Figure(data=go.Heatmap(
    z=df_pivot.values,
    x=df_pivot.columns,
    y=df_pivot.index,
    colorscale='RdBu',
    colorbar=dict(title='Effectiveness Score')
))

fig_heatmap.update_layout(
    title='Effectiveness Score by Medicine and Dilution',
    xaxis_title='Dilution (%)',
    yaxis_title='Medicine',
    width=1000,
    height=500
)
fig_heatmap.show()

ValueError: Mime type rendering requires nbformat>=4.2.0 but it is not installed

## Taste Profile and Dilution Analysis Visualizations

Let's analyze the relationships between:
1. Taste profiles and sensor patterns
2. Dilution effects on different medicines
3. Primary vs Secondary taste influences
4. Effectiveness across dilution levels
5. Sensor pattern fingerprints for each medicine

In [5]:
# Import visualization libraries
import plotly.graph_objects as go
import plotly.express as px
from plotly.subplots import make_subplots
from sklearn.decomposition import PCA

In [7]:
# 1. Taste Profile Radar Charts
fig = go.Figure()

# Create radar chart for each taste profile
for taste, profile in taste_profiles.items():
    fig.add_trace(go.Scatterpolar(
        r=[profile[ch] for ch in ['R', 'S', 'T', 'U', 'V', 'W']],
        theta=['R', 'S', 'T', 'U', 'V', 'W'],
        name=taste.capitalize(),
        fill='toself'
    ))

fig.update_layout(
    polar=dict(radialaxis=dict(range=[0, 1.5])),
    title='Taste Profile Sensor Patterns',
    showlegend=True,
    width=800,
    height=600
)
fig.show()

# 2. Medicine Sensor Patterns by Dilution
fig = make_subplots(rows=2, cols=2, 
                    subplot_titles=list(medicines.keys()),
                    specs=[[{'type': 'polar'}]*2]*2)

row_col = [(1,1), (1,2), (2,1), (2,2)]
for (medicine, tastes), (row, col) in zip(medicines.items(), row_col):
    for dilution in dilution_levels:
        med_data = df[(df['Medicine_Name'] == medicine) & 
                     (df['Dilution_Percent'] == dilution)][['R', 'S', 'T', 'U', 'V', 'W']].mean()
        fig.add_trace(
            go.Scatterpolar(
                r=med_data.values,
                theta=['R', 'S', 'T', 'U', 'V', 'W'],
                name=f'{dilution}%',
                showlegend=(row == 1 and col == 1)
            ),
            row=row, col=col
        )

fig.update_layout(
    height=800, width=1000,
    title_text="Medicine Sensor Patterns at Different Dilutions"
)
fig.show()

# 3. Effectiveness vs Dilution by Medicine
effectiveness_by_dilution = df.groupby(['Medicine_Name', 'Dilution_Percent'])['Effectiveness_Score'].mean().reset_index()
fig = px.line(effectiveness_by_dilution,
              x='Dilution_Percent',
              y='Effectiveness_Score',
              color='Medicine_Name',
              title='Medicine Effectiveness vs Dilution',
              labels={'Dilution_Percent': 'Dilution (%)',
                     'Effectiveness_Score': 'Effectiveness Score'},
              markers=True)
fig.show()

# 4. Taste Profile Distribution
taste_counts = pd.concat([
    df['Primary_Taste'].value_counts().rename('Primary'),
    df['Secondary_Taste'].value_counts().rename('Secondary')
], axis=1).fillna(0)

fig = go.Figure(data=[
    go.Bar(name='Primary Taste', x=taste_counts.index, y=taste_counts['Primary']),
    go.Bar(name='Secondary Taste', x=taste_counts.index, y=taste_counts['Secondary'])
])
fig.update_layout(
    barmode='group',
    title='Distribution of Primary and Secondary Tastes',
    xaxis_title='Taste',
    yaxis_title='Count'
)
fig.show()

# 5. Sensor Response Heatmap
sensor_corr = df[['R', 'S', 'T', 'U', 'V', 'W']].corr()
fig = px.imshow(sensor_corr,
                labels=dict(x="Sensor", y="Sensor", color="Correlation"),
                x=sensor_corr.columns,
                y=sensor_corr.columns,
                title="Sensor Cross-Correlation Matrix",
                color_continuous_scale='RdBu',
                width=700,
                height=700)
fig.show()

# 6. 3D Scatter Plot of Principal Components by Medicine and Dilution
# Prepare data for PCA
X = df[['R', 'S', 'T', 'U', 'V', 'W']].values
pca = PCA(n_components=3)
X_pca = pca.fit_transform(X)

fig = go.Figure()

for medicine in medicines:
    mask = df['Medicine_Name'] == medicine
    fig.add_trace(go.Scatter3d(
        x=X_pca[mask, 0],
        y=X_pca[mask, 1],
        z=X_pca[mask, 2],
        mode='markers',
        name=medicine,
        marker=dict(
            size=5,
            color=df[mask]['Dilution_Percent'],
            colorscale='Viridis',
            showscale=(medicine == next(iter(medicines)))
        ),
        text=[f"Dilution: {d}%" for d in df[mask]['Dilution_Percent']]
    ))

fig.update_layout(
    title='3D PCA Visualization of Medicines and Dilutions',
    scene=dict(
        xaxis_title='PC1',
        yaxis_title='PC2',
        zaxis_title='PC3'
    ),
    width=1000,
    height=800
)
fig.show()

# Print explained variance ratio
print("\nPCA Explained Variance Ratio:")
print(pd.DataFrame({
    'Principal Component': [f'PC{i+1}' for i in range(3)],
    'Explained Variance (%)': pca.explained_variance_ratio_ * 100
}).to_string(index=False))


PCA Explained Variance Ratio:
Principal Component  Explained Variance (%)
                PC1               98.157675
                PC2                1.187483
                PC3                0.238301


## Detailed Statistical Analysis

Let's analyze:
1. Statistical summaries by taste profile and dilution
2. Channel sensitivity analysis
3. Taste-specific correlations
4. Dilution impact analysis
5. Effect size calculations

In [8]:
# 1. Statistical Summary by Taste Profile
taste_stats = pd.DataFrame()
channels = ['R', 'S', 'T', 'U', 'V', 'W']

# Calculate statistics for primary tastes
for taste in df['Primary_Taste'].unique():
    # Per-channel summary statistics for this primary taste
    summary = df[df['Primary_Taste'] == taste][channels].agg(['mean', 'std', 'min', 'max'])
    summary.columns = pd.MultiIndex.from_product([[taste], summary.columns])
    taste_stats = pd.concat([taste_stats, summary], axis=1)

print("Channel Statistics by Primary Taste:")
print(taste_stats.round(3))

# 2. Channel Sensitivity Analysis
sensitivity_data = []
for channel in channels:
    for dilution in dilution_levels:
        sensitivity = df[df['Dilution_Percent'] == dilution][channel].std() / \
                     df[df['Dilution_Percent'] == dilution][channel].mean()
        sensitivity_data.append({
            'Channel': channel,
            'Dilution': dilution,
            'Sensitivity': sensitivity
        })

sensitivity_df = pd.DataFrame(sensitivity_data)
fig = px.line(sensitivity_df, x='Dilution', y='Sensitivity', 
              color='Channel', title='Channel Sensitivity vs Dilution',
              labels={'Dilution': 'Dilution (%)', 'Sensitivity': 'Coefficient of Variation'})
fig.show()

# 3. Taste-Specific Correlations
def plot_taste_correlations(taste):
    data = df[df['Primary_Taste'] == taste][channels]
    corr = data.corr()
    
    fig = px.imshow(corr, 
                    title=f'Channel Correlations for {taste.capitalize()} Taste',
                    labels=dict(x="Channel", y="Channel", color="Correlation"),
                    color_continuous_scale='RdBu')
    fig.show()
    return corr

taste_correlations = {}
for taste in df['Primary_Taste'].unique():
    print(f"\nCorrelations for {taste.capitalize()} Taste:")
    taste_correlations[taste] = plot_taste_correlations(taste)

# 4. Dilution Impact Analysis
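# Baseline per medicine: mean channel readings at 100% concentration
baseline = df[df['Dilution_Percent'] == 100].groupby('Medicine_Name')[channels].mean()

def calculate_dilution_impact(group):
    # group.name holds the (Medicine_Name, Dilution_Percent) key of this group
    base_readings = baseline.loc[group.name[0]]
    # Mean relative deviation from the 100% baseline over all rows and channels
    relative_change = (group - base_readings).abs() / base_readings
    return relative_change.to_numpy().mean()

dilution_impacts = df.groupby(['Medicine_Name', 'Dilution_Percent'])[channels].apply(calculate_dilution_impact)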
dilution_impacts = dilution_impacts.reset_index()
dilution_impacts.columns = ['Medicine_Name', 'Dilution_Percent', 'Impact']

fig = px.line(dilution_impacts, x='Dilution_Percent', y='Impact',
              color='Medicine_Name', title='Average Impact of Dilution by Medicine',
              labels={'Impact': 'Relative Change from Base Concentration'})
fig.show()

# 5. Effect Size Analysis (Cohen's d with a pooled standard deviation)

effect_sizes = pd.DataFrame()
for channel in channels:
    effects = {}
    for medicine in medicines:
        # Calculate Cohen's d for each dilution level
        base_group = df[(df['Medicine_Name'] == medicine) & 
                       (df['Dilution_Percent'] == 100)][channel]
        
        for dilution in dilution_levels[1:]:  # Skip 100%
            test_group = df[(df['Medicine_Name'] == medicine) & 
                          (df['Dilution_Percent'] == dilution)][channel]
            
            # Calculate Cohen's d
            d = (base_group.mean() - test_group.mean()) / \
                np.sqrt((base_group.var() + test_group.var()) / 2)
            
            effects[f'{medicine}_{dilution}%'] = d
    
    effect_sizes[channel] = pd.Series(effects)

print("\nEffect Sizes (Cohen's d) by Channel:")
print(effect_sizes.round(3))

# 6. Summary Statistics
print("\nKey Findings:")
print("-" * 50)

# Most sensitive channel
sensitivity_summary = sensitivity_df.groupby('Channel')['Sensitivity'].mean()
print(f"Most sensitive channel: {sensitivity_summary.idxmax()} (avg CV: {sensitivity_summary.max():.3f})")

# Strongest correlations
strongest_corr = pd.DataFrame()
for taste, corr in taste_correlations.items():
    mask = np.triu(np.ones_like(corr), k=1).astype(bool)
    strongest = corr.abs().where(mask).max().max()
    strongest_corr[taste] = [strongest]
print(f"Strongest taste-specific correlation: {strongest_corr.iloc[0].idxmax()} ({strongest_corr.iloc[0].max():.3f})")

# Most impacted medicine
most_impacted = dilution_impacts[dilution_impacts['Dilution_Percent'] == dilution_levels[-1]]
# idxmax() returns an index label, so look it up with .loc rather than .iloc
print(f"Most impacted by dilution: {most_impacted.loc[most_impacted['Impact'].idxmax(), 'Medicine_Name']}")

# Largest effect size
max_effect = effect_sizes.abs().max().max()
max_effect_channel = effect_sizes.abs().max().idxmax()
print(f"Largest dilution effect: {max_effect_channel} (d = {max_effect:.3f})")

Channel Statistics by Primary Taste:
     bitter                                    pungent                       \
          R      S      T      U      V      W       R      S      T      U   
mean  0.376  0.417  0.535  0.595  0.546  0.459   0.474  0.454  0.464  0.549   
std   0.229  0.256  0.337  0.376  0.345  0.286   0.296  0.282  0.287  0.345   
min   0.100  0.100  0.100  0.100  0.100  0.100   0.100  0.100  0.100  0.100   
max   0.864  0.919  1.137  1.246  1.165  0.969   1.012  0.953  0.987  1.154   

                    
          V      W  
mean  0.553  0.493  
std   0.349  0.312  
min   0.100  0.100  
max   1.142  1.016  



Correlations for Bitter Taste:



Correlations for Pungent Taste:







Effect Sizes (Cohen's d) by Channel:
                      R       S       T       U       V       W
Ashwagandha_75%   6.407   6.544   5.424   5.884   5.520   6.287
Ashwagandha_50%  14.610  13.754  11.105  14.575  13.861  13.701
Ashwagandha_25%  24.718  22.646  18.826  24.203  24.673  22.599
Ashwagandha_10%  29.112  27.426  23.104  29.897  30.807  27.984
Turmeric_75%      5.990   5.811   5.200   6.121   5.785   5.607
Turmeric_50%     13.366  12.651  11.758  14.631  12.113  14.689
Turmeric_25%     21.745  20.863  19.065  25.980  18.983  22.756
Turmeric_10%     26.290  25.269  23.447  32.022  23.505  28.013
Tulsi_75%         6.036   6.644   5.149   5.140   5.246   6.062
Tulsi_50%        11.544  14.753  12.028  12.632  14.029  13.801
Tulsi_25%        18.512  24.058  20.089  19.666  23.915  22.315
Tulsi_10%        22.222  29.946  24.751  24.211  29.478  27.079
Neem_75%          5.042   5.345   6.508   6.147   5.507   6.790
Neem_50%         10.862  11.639  15.952  15.465  13.683  18.225
Ne
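
The effect sizes above are very large because the synthetic generator only adds 5% multiplicative noise to each channel. As a rough cross-check, the sketch below computes the value of Cohen's d that this noise model implies, assuming each reading is `base * c * N(1, noise)` with `c = dilution / 100`. The function name `expected_cohens_d` is hypothetical, and the calculation ignores the clip at 0.1, which distorts the 10% rows.

```python
import numpy as np

def expected_cohens_d(concentration, noise=0.05):
    """Cohen's d between the 100% group and a diluted group.

    Under reading = base * c * N(1, noise), the means are base and base * c,
    and the standard deviations are noise * base and noise * base * c,
    so the base value cancels out of the ratio.
    """
    mean_gap = 1.0 - concentration
    pooled_sd = noise * np.sqrt((1.0 + concentration ** 2) / 2.0)
    return mean_gap / pooled_sd

for pct in [75, 50, 25, 10]:
    print(f"{pct}%: d = {expected_cohens_d(pct / 100):.2f}")
```

This gives about 5.7, 12.6, 20.6 and 25.3 for the 75%, 50%, 25% and 10% levels, which is in line with the sample estimates in the table. Values this large reflect the low noise of the simulation, not a property of real sensor data.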

In [None]:
# Visualize model predictions and performance
import plotly.express as px
from sklearn.metrics import confusion_matrix, mean_squared_error, r2_score

# Dilution Model Performance
y_pred_dilution = cv_scores['dilution']['predictions']
y_true_dilution = cv_scores['dilution']['true_values']

fig_dilution = px.scatter(
    x=y_true_dilution, 
    y=y_pred_dilution,
    labels={'x': 'True Dilution (%)', 'y': 'Predicted Dilution (%)'},
    title=f'Dilution Predictions (R² = {r2_score(y_true_dilution, y_pred_dilution):.3f})'
)
fig_dilution.add_trace(
    go.Scatter(x=[0, 100], y=[0, 100], mode='lines', 
               name='Perfect Prediction', line=dict(dash='dash'))
)
fig_dilution.show()

# Medicine Classification Performance
y_pred_medicine = cv_scores['medicine']['predictions']
y_true_medicine = cv_scores['medicine']['true_values']

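# confusion_matrix orders classes by sorted label value, so the axis labels must
# follow the same order (this assumes the labels are medicine names or
# LabelEncoder codes, both of which sort alphabetically)
medicine_labels = sorted(medicines)
conf_matrix = confusion_matrix(y_true_medicine, y_pred_medicine)
fig_conf = px.imshow(conf_matrix,
                     labels=dict(x="Predicted Medicine", y="True Medicine"),
                     x=medicine_labels,
                     y=medicine_labels,
                     title="Medicine Classification Confusion Matrix",
                     color_continuous_scale='Blues')
fig_conf.show()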

# Effectiveness Model Performance
y_pred_effectiveness = cv_scores['effectiveness']['predictions']
y_true_effectiveness = cv_scores['effectiveness']['true_values']

fig_effectiveness = px.scatter(
    x=y_true_effectiveness, 
    y=y_pred_effectiveness,
    labels={'x': 'True Effectiveness', 'y': 'Predicted Effectiveness'},
    title=f'Effectiveness Predictions (R² = {r2_score(y_true_effectiveness, y_pred_effectiveness):.3f})'
)
fig_effectiveness.add_trace(
    go.Scatter(x=[0, 1], y=[0, 1], mode='lines', 
               name='Perfect Prediction', line=dict(dash='dash'))
)
fig_effectiveness.show()


Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.





TypeError: ForestRegressor.predict() got an unexpected keyword argument 'return_std'
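
The `TypeError` above arises because scikit-learn's random forest `predict()` does not accept `return_std` (some other estimators, such as `GaussianProcessRegressor`, do). The code that made the call is not shown in this notebook, so the following is only a sketch of one common workaround: using the spread of the individual trees' predictions as an uncertainty estimate. The helper name `predict_with_std` and the toy data are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def predict_with_std(forest, X):
    """Return (mean, std) of the individual trees' predictions for each row of X."""
    per_tree = np.stack([tree.predict(X) for tree in forest.estimators_])
    return per_tree.mean(axis=0), per_tree.std(axis=0)

# Toy data standing in for six sensor channels and a dilution-like target
rng = np.random.default_rng(42)
X_demo = rng.uniform(0.1, 1.2, size=(200, 6))
y_demo = 100 * X_demo.mean(axis=1)

forest = RandomForestRegressor(n_estimators=50, random_state=42).fit(X_demo, y_demo)
mean_pred, std_pred = predict_with_std(forest, X_demo[:5])

for mean, std in zip(mean_pred, std_pred):
    print(f"prediction: {mean:.1f} (tree std: {std:.2f})")
```

The tree spread is a heuristic measure of model disagreement, not a calibrated confidence interval.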

# Real-time Prediction Example

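Finally, let's demonstrate a real-time prediction. This cell calls the trained model directly rather than going through the API endpoint:
1. Create a sample sensor reading
2. Preprocess it and engineer features the same way as the training data
3. Print the predictions with their confidence scores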

In [7]:
# Create a sample reading
sample_reading = {
    'R': 3.5,
    'S': 2.8,
    'T': 4.2,
    'U': 3.0,
    'V': 3.8,
    'W': 2.5,
    'Temperature': 25.0
}

# Convert to numpy array
sample_X = np.array([[
    sample_reading['R'],
    sample_reading['S'],
    sample_reading['T'],
    sample_reading['U'],
    sample_reading['V'],
    sample_reading['W'],
    sample_reading['Temperature']
]])

# Preprocess the sample
sample_X_norm = preprocessor.normalize_features(
    preprocessor.temperature_compensation(pd.DataFrame([sample_reading]))
).values

# Engineer features
sample_features = feature_engineer.engineer_features(
    sample_X_norm[:, :-1],
    sample_X_norm[:, -1],
    fit=False
)

# Generate predictions
sample_predictions = model.predict(sample_features['combined'])

# Display results
print("Prediction Results:")
print(f"Dilution: {sample_predictions['dilution']['predictions'][0]:.1f}% (confidence: {sample_predictions['dilution']['confidence'][0]:.2f})")
print(f"Medicine: {sample_predictions['medicine']['predictions'][0]} (confidence: {sample_predictions['medicine']['confidence'][0]:.2f})")
print(f"Effectiveness: {sample_predictions['effectiveness']['predictions'][0]:.2f} (confidence: {sample_predictions['effectiveness']['confidence'][0]:.2f})")

ValueError: X has 39 features, but RandomForestRegressor is expecting 7 features as input.
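
The `ValueError` above suggests that the model was fitted on the 7 raw columns (six channels plus temperature) but is being asked to predict from the 39 engineered columns. The training cell is not shown here, so this cannot be confirmed. As a sketch of how to catch such a mismatch early, the hypothetical helper below compares the input width with the fitted estimator's `n_features_in_` attribute. The data and the name `model_raw` are made up for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def check_feature_count(estimator, X):
    """Raise a descriptive error if X does not match the width used at fit time."""
    expected = estimator.n_features_in_
    if X.shape[1] != expected:
        raise ValueError(
            f"model was fitted on {expected} features but received {X.shape[1]}; "
            "train and predict must use the same feature set"
        )
    return X

# Random stand-ins for the raw (7-column) and engineered (39-column) matrices
rng = np.random.default_rng(0)
X_raw = rng.normal(size=(50, 7))
X_combined = rng.normal(size=(50, 39))
y = rng.normal(size=50)

model_raw = RandomForestRegressor(n_estimators=10, random_state=0).fit(X_raw, y)

check_feature_count(model_raw, X_raw[:1])        # same width as training: passes
try:
    check_feature_count(model_raw, X_combined[:1])
except ValueError as err:
    print(err)
```

Whichever feature matrix is chosen, the same one has to be used in both places: either fit on the engineered features or predict on the raw ones.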