<a href="https://colab.research.google.com/github/nonyeezeh/Research-Project-Code/blob/main/code_4.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Imports

In [25]:
import numpy as np
import pandas as pd
import networkx as nx
import plotly.graph_objects as go

import pandas as pd
import tensorflow as tf
from sklearn.preprocessing import OneHotEncoder

# Research Question

##### In this research, how does the predictive accuracy of a neural network compare to that of a Bayesian network in predicting stock prices, particularly when trained on varying sample sizes of data generated by a Bayesian network?

# Expectations

1. With larger training samples, the neural network's performance is expected to improve due to having sufficient data for effective learning, while the Bayesian network may outperform the neural network on smaller samples.
2. The Bayesian network is anticipated to show more consistent performance across different sample sizes due to its probabilistic nature and reliance on prior knowledge.
3. The neural network might require more computational resources and time to train, especially with increasing sample sizes, compared to the Bayesian network.

# Data: 3 Nodes, 500 Samples

## Bayesian Network Data Generation

In [None]:
# Define the number of samples
num_samples = 500

# Define the possible values for each variable
values = {
    'IR': ['low', 'medium', 'high'],
    'EI': ['poor', 'average', 'good'],
    'SP': ['decrease', 'stable', 'increase']
}

# Functions to sample each variable with probabilities
def sample_IR():
    probabilities = np.random.dirichlet(np.ones(len(values['IR'])))
    rounded_probs = [round(p, 2) for p in probabilities]
    chosen_index = np.argmax(probabilities)
    chosen_value = values['IR'][chosen_index]
    return chosen_value, rounded_probs

def sample_EI(ir=None):
    probabilities = np.random.dirichlet(np.ones(len(values['EI'])))
    rounded_probs = [round(p, 2) for p in probabilities]
    chosen_index = np.argmax(probabilities)
    chosen_value = values['EI'][chosen_index]
    return chosen_value, rounded_probs

def sample_SP(ir, ei):
    # Generate probabilities based on a joint influence of IR and EI
    probabilities = np.random.dirichlet(np.ones(len(values['SP'])))
    rounded_probs = [round(p, 2) for p in probabilities]
    chosen_index = np.argmax(probabilities)
    chosen_value = values['SP'][chosen_index]
    return chosen_value, rounded_probs

# Randomly determine the structure (edges)
edges = []
if np.random.rand() > 0.5:
    edges.append(('IR', 'EI'))
if np.random.rand() > 0.5:
    edges.append(('EI', 'IR'))
if np.random.rand() > 0.5:
    edges.append(('IR', 'SP'))
if np.random.rand() > 0.5:
    edges.append(('EI', 'SP'))

# Ensure there's at least one edge to SP (either from IR or EI)
if not any(edge[1] == 'SP' for edge in edges):
    edges.append(np.random.choice([('IR', 'SP'), ('EI', 'SP')]))

# Generate the data and capture probabilities
data = []
probabilities_data = []

for _ in range(num_samples):
    ir, ir_probs = sample_IR()
    ei, ei_probs = sample_EI(ir)
    sp, sp_probs = sample_SP(ir, ei)

    data.append([ir, ei, sp])
    probabilities_data.append([
        ','.join(map(str, ir_probs)),
        ir,
        ','.join(map(str, ei_probs)),
        ei,
        ','.join(map(str, sp_probs)),
        sp
    ])

# Convert to DataFrame for the main data
df = pd.DataFrame(data, columns=['IR', 'EI', 'SP'])

# Save the main data to a CSV file
df.to_csv('bn_data_structure.csv', index=False)

# Convert to DataFrame for probabilities and chosen values
probabilities_df = pd.DataFrame(probabilities_data, columns=[
    'IR_Probabilities', 'Chosen_IR',
    'EI_Probabilities', 'Chosen_EI',
    'SP_Probabilities', 'Chosen_SP'
])

# Save the probabilities and chosen values to a CSV file
#probabilities_df.to_csv('bn_probabilities.csv', index=False)

# Display the first 5 rows of each DataFrame
print("Generated data:")
print(df.head())

print("\nProbabilities and chosen values:")
print(probabilities_df.head())

print("\nMain data and probabilities saved successfully.")

#-----------------------------------------------------------------------------------------------------

# Extract the necessary columns from the probabilities_df DataFrame
test_data = probabilities_df[['Chosen_IR', 'Chosen_EI', 'Chosen_SP']]

# Find the SP Probability corresponding to the chosen SP
test_data['SP_Probability'] = probabilities_df.apply(
    lambda row: float(row['SP_Probabilities'].split(',')[values['SP'].index(row['Chosen_SP'])]),
    axis=1
)

# Rename the columns to match the intended output format
test_data.rename(columns={
    'Chosen_IR': 'IR',
    'Chosen_EI': 'EI',
    'Chosen_SP': 'SP'
}, inplace=True)

# Save the test data to a new CSV file with only the specified columns
test_data.to_csv('bn_test_data_for_NN.csv', index=False)

# Print confirmation
print("Test data saved successfully as bn_test_data_for_NN.csv.")

In [None]:
# Visualize the Bayesian Network structure using Plotly
G = nx.DiGraph()

# Add nodes and edges
G.add_edges_from(edges)

# Extract node positions for Plotly
pos = nx.spring_layout(G)
edge_x = []
edge_y = []
arrow_x = []
arrow_y = []

for edge in G.edges():
    x0, y0 = pos[edge[0]]
    x1, y1 = pos[edge[1]]
    edge_x.append(x0)
    edge_x.append(x1)
    edge_x.append(None)
    edge_y.append(y0)
    edge_y.append(y1)
    edge_y.append(None)

    # Move arrows closer to the target node (x1, y1)
    arrow_x.append(0.90 * x1 + 0.10 * x0)
    arrow_y.append(0.90 * y1 + 0.10 * x0)


edge_trace = go.Scatter(
    x=edge_x, y=edge_y,
    line=dict(width=2, color='gray'),
    hoverinfo='none',
    mode='lines')

node_x = []
node_y = []
node_text = []
node_color = []

for node in G.nodes():
    x, y = pos[node]
    node_x.append(x)
    node_y.append(y)
    node_text.append(node)

    # Highlight the SP node with a different color
    if node == 'SP':
        node_color.append('pink')
    else:
        node_color.append('purple')

node_trace = go.Scatter(
    x=node_x, y=node_y,
    mode='markers+text',
    text=node_text,
    textposition="top center",
    hoverinfo='text',
    marker=dict(size=50, color=node_color, line=dict(width=2)))

# Adding the arrow heads, placing them correctly outside the nodes
#arrow_trace = go.Scatter(
    #x=arrow_x, y=arrow_y,
    #mode='markers',
    #marker=dict(size=10, color='black', symbol='triangle-up'),
    #hoverinfo='none'
#)

#fig = go.Figure(data=[edge_trace, node_trace, arrow_trace],
fig = go.Figure(data=[edge_trace, node_trace],
             layout=go.Layout(
                showlegend=False,
                hovermode='closest',
                margin=dict(b=20, l=20, r=20, t=50),  # Adjusted margins to fit the title
                xaxis=dict(showgrid=False, zeroline=False),
                yaxis=dict(showgrid=False, zeroline=False),
                plot_bgcolor='aliceblue')
                )

# Update layout to include a proper title
fig.update_layout(title_text="Bayesian Network Structure", title_x=0.5)

fig.show()

## Neural Network

### Neural Network Training

In [None]:
# Load the data
data = pd.read_csv('bn_data_structure.csv')

# Preprocess the data
encoder = OneHotEncoder(sparse=False)
X = encoder.fit_transform(data[['IR', 'EI']])  # One-hot encode IR and EI
y = encoder.fit_transform(data[['SP']])        # One-hot encode SP

# Build and train the neural network
model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, input_dim=X.shape[1], activation='relu'),
    tf.keras.layers.Dense(y.shape[1], activation='softmax')
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X, y, epochs=50, batch_size=16, verbose=1)  # Train with 50 epochs

# Get predicted probabilities from the NN
y_pred_probs = model.predict(X)

# Convert probabilities to predicted SP categories
y_pred = np.argmax(y_pred_probs, axis=1)
predicted_sp = encoder.categories_[-1][y_pred]

# Get the final probability for the predicted SP
predicted_sp_probs = [y_pred_probs[i, idx] for i, idx in enumerate(y_pred)]

# Prepare the output DataFrame
output_df = pd.DataFrame({
    'IR': data['IR'],
    'EI': data['EI'],
    'Actual SP': data['SP'],
    'Predicted SP': predicted_sp,
    'Predicted SP Probability': predicted_sp_probs
})

# Display the first 10 rows of the output
print(output_df.head(10))

# Optionally save the output to a CSV file
output_df.to_csv('nn_output_with_probabilities.csv', index=False)

### NN and BN (Ground Truth) MSE

In [None]:

# Load the test data with BN probabilities
bn_test_data = pd.read_csv('bn_test_data_for_NN.csv')

# Load the NN output data with NN probabilities
nn_output_data = pd.read_csv('nn_output_with_probabilities.csv')

# Ensure the data is aligned by checking for matching IR, EI, and SP
assert np.all(bn_test_data['IR'] == nn_output_data['IR'])
assert np.all(bn_test_data['EI'] == nn_output_data['EI'])
assert np.all(bn_test_data['SP'] == nn_output_data['Actual SP'])

# Calculate the squared differences between BN and NN probabilities
bn_prob = bn_test_data['SP_Probability']
nn_prob = nn_output_data['Predicted SP Probability']
squared_diffs = (bn_prob - nn_prob) ** 2

# Combine the relevant columns into a DataFrame for display
comparison_df = pd.DataFrame({
    'BN Probability': bn_prob,
    'NN Probability': round(nn_prob,2),
    'Squared Difference': round(squared_diffs,2)
})

# Display the first few rows to see the comparison
print("Comparison of BN and NN probabilities (first few rows):")
print(comparison_df.head(10))

# Calculate the Mean Squared Error (MSE)
mse = squared_diffs.mean()

# Display the MSE value
print(f"\nMean Squared Error (MSE) between BN and NN probabilities: {round(mse,2)}")

# ```Fully Optimised Code```

In [71]:
# Define the range of sample sizes
sample_sizes = range(500, 10001, 500)  # 500, 1000, 1500, ..., 10000
mse_values = []

for num_samples in sample_sizes:
    print(f"Processing {num_samples} samples...")

    # -----------------------------------------------
    # Generate Data and Capture Probabilities
    # -----------------------------------------------
    # Generate the data and capture probabilities
    data = []
    probabilities_data = []

    for _ in range(num_samples):
        ir, ir_probs = sample_IR()
        ei, ei_probs = sample_EI(ir)
        sp, sp_probs = sample_SP(ir, ei)

        data.append([ir, ei, sp])
        probabilities_data.append([
            ','.join(map(str, ir_probs)),
            ir,
            ','.join(map(str, ei_probs)),
            ei,
            ','.join(map(str, sp_probs)),
            sp
        ])

    # Convert to DataFrame for the main data
    df = pd.DataFrame(data, columns=['IR', 'EI', 'SP'])

    # Convert to DataFrame for probabilities and chosen values
    probabilities_df = pd.DataFrame(probabilities_data, columns=[
        'IR_Probabilities', 'Chosen_IR',
        'EI_Probabilities', 'Chosen_EI',
        'SP_Probabilities', 'Chosen_SP'
    ])

    # Prepare test data for NN
    test_data = probabilities_df[['Chosen_IR', 'Chosen_EI', 'Chosen_SP']]

    # Find the SP Probability corresponding to the chosen SP
    test_data['SP_Probability'] = probabilities_df.apply(
        lambda row: float(row['SP_Probabilities'].split(',')[values['SP'].index(row['Chosen_SP'])]),
        axis=1
    )

    # Rename the columns to match the intended output format
    test_data.rename(columns={
        'Chosen_IR': 'IR',
        'Chosen_EI': 'EI',
        'Chosen_SP': 'SP'
    }, inplace=True)

    # -----------------------------------------------
    # Visualize the Bayesian Network structure for certain sample sizes
    # -----------------------------------------------
    if num_samples in [500, 10000]:  # Only visualize for the first and last sample sizes
        G = nx.DiGraph()

        # Add nodes and edges
        G.add_edges_from(edges)

        # Extract node positions for Plotly
        pos = nx.spring_layout(G)
        edge_x = []
        edge_y = []
        arrow_x = []
        arrow_y = []

        for edge in G.edges():
            x0, y0 = pos[edge[0]]
            x1, y1 = pos[edge[1]]
            edge_x.append(x0)
            edge_x.append(x1)
            edge_x.append(None)
            edge_y.append(y0)
            edge_y.append(y1)
            edge_y.append(None)

            # Move arrows closer to the target node (x1, y1)
            arrow_x.append(0.90 * x1 + 0.10 * x0)
            arrow_y.append(0.90 * y1 + 0.10 * x0)

        edge_trace = go.Scatter(
            x=edge_x, y=edge_y,
            line=dict(width=2, color='gray'),
            hoverinfo='none',
            mode='lines')

        node_x = []
        node_y = []
        node_text = []
        node_color = []

        for node in G.nodes():
            x, y = pos[node]
            node_x.append(x)
            node_y.append(y)
            node_text.append(node)

            # Highlight the SP node with a different color
            if node == 'SP':
                node_color.append('pink')
            else:
                node_color.append('purple')

        node_trace = go.Scatter(
            x=node_x, y=node_y,
            mode='markers+text',
            text=node_text,
            textposition="top center",
            hoverinfo='text',
            marker=dict(size=50, color=node_color, line=dict(width=2)))

        fig = go.Figure(data=[edge_trace, node_trace],
                        layout=go.Layout(
                            showlegend=False,
                            hovermode='closest',
                            margin=dict(b=20, l=20, r=20, t=50),  # Adjusted margins to fit the title
                            xaxis=dict(showgrid=False, zeroline=False),
                            yaxis=dict(showgrid=False, zeroline=False),
                            plot_bgcolor='aliceblue')
                        )

        # Update layout to include a proper title
        fig.update_layout(title_text=f"Bayesian Network Structure for {num_samples} Samples", title_x=0.5)
        fig.show()

    # -----------------------------------------------
    # Train Neural Network and Predict
    # -----------------------------------------------
    # Preprocess the data
    encoder = OneHotEncoder(sparse=False)
    X = encoder.fit_transform(df[['IR', 'EI']])  # One-hot encode IR and EI
    y = encoder.fit_transform(df[['SP']])        # One-hot encode SP

    # Build and train the neural network
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(10, input_dim=X.shape[1], activation='relu'),
        tf.keras.layers.Dense(y.shape[1], activation='softmax')
    ])
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    model.fit(X, y, epochs=50, batch_size=16, verbose=0)  # Train with 50 epochs

    # Get predicted probabilities from the NN
    y_pred_probs = model.predict(X)

    # Convert probabilities to predicted SP categories
    y_pred = np.argmax(y_pred_probs, axis=1)
    predicted_sp = encoder.categories_[-1][y_pred]

    # Get the final probability for the predicted SP
    predicted_sp_probs = [y_pred_probs[i, idx] for i, idx in enumerate(y_pred)]

    # Prepare the output DataFrame
    output_df = pd.DataFrame({
        'IR': df['IR'],
        'EI': df['EI'],
        'Actual SP': df['SP'],
        'Predicted SP': predicted_sp,
        'Predicted SP Probability': predicted_sp_probs
    })

    # -----------------------------------------------
    # Calculate MSE
    # -----------------------------------------------
    # Ensure the data is aligned by checking for matching IR, EI, and SP
    assert np.all(test_data['IR'] == output_df['IR'])
    assert np.all(test_data['EI'] == output_df['EI'])
    assert np.all(test_data['SP'] == output_df['Actual SP'])

    # Calculate the squared differences between BN and NN probabilities
    bn_prob = test_data['SP_Probability']
    nn_prob = output_df['Predicted SP Probability']
    squared_diffs = (bn_prob - nn_prob) ** 2

    # Calculate the Mean Squared Error (MSE)
    mse = squared_diffs.mean()
    mse_values.append(mse)

    print(f"MSE for {num_samples} samples: {mse}")

# -----------------------------------------------
# Save MSE values to a CSV file
# -----------------------------------------------
mse_df = pd.DataFrame({
    'Sample Size': list(sample_sizes),
    'MSE': mse_values
})
mse_df.to_csv('mse_values.csv', index=False)

# -----------------------------------------------
# Plot the MSE values using Plotly
# -----------------------------------------------
fig = go.Figure()

fig.add_trace(go.Scatter(
    x=list(sample_sizes),
    y=mse_values,
    mode='lines+markers',
    marker=dict(size=10),
    line=dict(width=2),
    name='MSE'
))

fig.update_layout(
    title='MSE of NN vs. Sample Size',
    xaxis_title='Sample Size',
    yaxis_title='Mean Squared Error (MSE)',
    plot_bgcolor='aliceblue',
    hovermode='x'
)

Processing 500 samples...




A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy




`sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.


`sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.


Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.



[1m16/16[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 5ms/step
MSE for 500 samples: 0.08425308090430358
Processing 1000 samples...




A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy


`sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.


`sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.


Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.



[1m32/32[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 2ms/step
MSE for 1000 samples: 0.07660535330442077
Processing 1500 samples...




A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy


`sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.


`sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.


Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.



[1m47/47[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 2ms/step
MSE for 1500 samples: 0.08521264836412971
Processing 2000 samples...




A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy


`sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.


`sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.


Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.



[1m63/63[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 1ms/step
MSE for 2000 samples: 0.08628036180853092
Processing 2500 samples...




A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy


`sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.


`sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.


Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.



[1m79/79[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 1ms/step
MSE for 2500 samples: 0.080979454034319
Processing 3000 samples...




A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy


`sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.


`sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.


Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.



[1m94/94[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 2ms/step
MSE for 3000 samples: 0.08466549903052378
Processing 3500 samples...




A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy


`sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.


`sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.


Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.



[1m110/110[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 1ms/step
MSE for 3500 samples: 0.08601321541064771
Processing 4000 samples...




A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy


`sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.


`sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.


Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.



[1m125/125[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 1ms/step
MSE for 4000 samples: 0.08723283284211482
Processing 4500 samples...




A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy


`sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.


`sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.


Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.



[1m141/141[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 2ms/step
MSE for 4500 samples: 0.0835805382323841
Processing 5000 samples...




A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy


`sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.


`sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.


Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.



[1m157/157[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 1ms/step
MSE for 5000 samples: 0.0829570367178473
Processing 5500 samples...




A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy


`sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.


`sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.


Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.



[1m172/172[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 1ms/step
MSE for 5500 samples: 0.08630281084396332
Processing 6000 samples...




A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy


`sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.


`sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.


Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.



[1m188/188[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 1ms/step
MSE for 6000 samples: 0.08792689663890434
Processing 6500 samples...




A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy


`sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.


`sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.


Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.



[1m204/204[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 1ms/step
MSE for 6500 samples: 0.08513897173437206
Processing 7000 samples...




A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy


`sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.


`sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.


Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.



[1m219/219[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 1ms/step
MSE for 7000 samples: 0.08899939339010916
Processing 7500 samples...




A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy


`sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.


`sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.


Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.



[1m235/235[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 1ms/step
MSE for 7500 samples: 0.08924037651604028
Processing 8000 samples...




A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy


`sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.


`sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.


Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.



[1m250/250[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 1ms/step
MSE for 8000 samples: 0.08648561807846097
Processing 8500 samples...




A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy


`sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.


`sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.


Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.



[1m266/266[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 1ms/step
MSE for 8500 samples: 0.08874001870902944
Processing 9000 samples...




A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy


`sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.


`sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.


Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.



[1m282/282[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 1ms/step
MSE for 9000 samples: 0.08847766665824906
Processing 9500 samples...




A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy


`sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.


`sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.


Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.



[1m297/297[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 1ms/step
MSE for 9500 samples: 0.08762609381685298
Processing 10000 samples...




A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy




`sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.


`sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.


Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.



[1m313/313[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 1ms/step
MSE for 10000 samples: 0.08855105018797062
