typo found in t81_558_class_14_03_anomaly.ipynb #51

Closed
alik604 opened this issue Mar 3, 2020 · 3 comments

Comments

@alik604

alik604 commented Mar 3, 2020

First of all, thank you for posting these notebooks. They are a nice, concise way for me to test out a new concept :)

The notebook t81_558_class_14_03_anomaly.ipynb has typos in the last cell

score1 = np.sqrt(metrics.mean_squared_error(pred,x_normal_test))
print(f"Insample Normal Score (RMSE): {score1}".format(score1))
# score is the test set
# score2 is the whole dataset (- attacks) 

Only the second- and third-to-last print statements need to be changed.

@jeffheaton
Owner

jeffheaton commented Mar 8, 2020

You are basically saying the out-of-sample and in-sample prints were flipped? I made that adjustment and also got rid of the stray .format(score1).
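As a small illustration of why the stray .format(score1) could be dropped safely: an f-string is evaluated first, so the resulting string has no placeholders left for .format to fill. The value of score1 below is a made-up number, not a result from the notebook.

```python
# Hypothetical RMSE value, used only for illustration.
score1 = 0.123456

# The f-string is evaluated first; .format() then finds no replacement fields.
with_format = f"Score (RMSE): {score1}".format(score1)
without_format = f"Score (RMSE): {score1}"

# Both strings are identical, so the .format() call is redundant.
print(with_format == without_format)
```

The call was harmless here, but removing it keeps the print statement from suggesting that two formatting mechanisms are in play.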

@alik604
Author

alik604 commented Mar 8, 2020

I recall there being two typos, but I see only one now:

df_normal = df[normal_mask]
x_normal = df_normal.values

x_normal_train, x_normal_test = train_test_split(x_normal, test_size=0.25, random_state=42)

pred = model.predict(x_normal)
score2 = np.sqrt(metrics.mean_squared_error(pred,x_normal))
print(f"Insample Normal Score (RMSE): {score2}")

Regarding the last line: to my understanding, "Insample" implies the data comes from the training set. However, this appears to be all of the (normal) data.

Sorry if I'm mistaken. Please feel free to close the request whenever you wish.

@jeffheaton
Owner

I added more description. Training occurred entirely on normal data, so the in-sample and out-of-sample scores both come from normal data only. The final RMSE reports the error on the non-normal data, which is higher, indicating an anomaly.
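
The three scores discussed in this thread can be sketched roughly as follows. This is not the notebook's code: it uses synthetic data and swaps in a two-component PCA reconstruction (NumPy only) for the notebook's model, purely to show which rows each RMSE is computed on. Names such as x_attack, reconstruct, and rmse are assumptions made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic "normal" rows: 6 features driven by 2 latent factors plus small noise.
latent = rng.normal(size=(400, 2))
mixing = rng.normal(size=(2, 6))
x_normal = latent @ mixing + 0.05 * rng.normal(size=(400, 6))

# Synthetic "attack" rows: unstructured data that ignores the normal pattern.
x_attack = 3.0 * rng.normal(size=(100, 6))

# 75/25 split of the normal rows (a stand-in for train_test_split).
idx = rng.permutation(len(x_normal))
cut = int(0.75 * len(x_normal))
x_normal_train = x_normal[idx[:cut]]
x_normal_test = x_normal[idx[cut:]]

# Stand-in model: PCA with 2 components, fit on the normal training rows only.
mean = x_normal_train.mean(axis=0)
_, _, vt = np.linalg.svd(x_normal_train - mean, full_matrices=False)
components = vt[:2]


def reconstruct(x):
    """Project rows onto the learned components and map them back."""
    return (x - mean) @ components.T @ components + mean


def rmse(x):
    """Root mean squared reconstruction error over all rows and features."""
    return float(np.sqrt(np.mean((reconstruct(x) - x) ** 2)))


rmse_in_sample = rmse(x_normal_train)      # rows the model was fit on
rmse_out_of_sample = rmse(x_normal_test)   # held-out normal rows
rmse_attack = rmse(x_attack)               # non-normal rows

print(f"In-sample Normal Score (RMSE): {rmse_in_sample}")
print(f"Out-of-sample Normal Score (RMSE): {rmse_out_of_sample}")
print(f"Attack Score (RMSE): {rmse_attack}")
```

In this sketch the label "in-sample" is reserved for the training rows only. Scoring all normal rows, as in the snippet quoted earlier in the thread, mixes training and held-out rows, which is the ambiguity raised above.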
