Linear Regression Health Costs Calculator (Certification Project) #270

Closed
beaucarnes opened this issue Nov 27, 2019 · 10 comments

@beaucarnes
Member

Create project from the Python machine learning certification.

@beaucarnes added this to Not Started in Back Seven Projects Nov 27, 2019
@beaucarnes changed the title from "Linear Regression Fuel Economy Calculator (Certification Project)" to "Linear Regression Health Costs Calculator (Certification Project)" Feb 5, 2020
@maikroservice

maikroservice commented Mar 11, 2020

Hello - I just tried my hand at this challenge and found a couple of things that I am not sure about:

In data science one usually wants the simplest possible model, because it makes the influence of individual parameters easier to understand (linear regression over a neural net). Several implementations come to mind before reaching for keras/tf for such a model, especially sklearn.

The next thing is that 'monetary values' are often predicted on a log10/log scale, since hardly anyone cares about the cents and a correct ball-park number matters more, e.g.
image - the closer the target distribution is to a bell curve, the 'easier' it is for the model to predict
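
To illustrate the point about the bell curve, here is a minimal sketch, assuming synthetic right-skewed values drawn from a log-normal distribution (not the project's dataset). The `skewness` helper is a hypothetical name introduced only for this example; it compares how asymmetric the values are before and after a log10 transform.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic, right-skewed "charges" (assumption: log-normal stand-in,
# NOT the dataset used in the challenge).
charges = rng.lognormal(mean=9.0, sigma=0.9, size=5000)


def skewness(x):
    """Sample skewness: third standardized moment (0 for a symmetric distribution)."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    return float((centered ** 3).mean() / centered.std() ** 3)


# Log-transform the target so its distribution is closer to a bell curve.
log_charges = np.log10(charges)

raw_skew = skewness(charges)
log_skew = skewness(log_charges)

print(f"skewness of raw values:   {raw_skew:.2f}")
print(f"skewness of log10 values: {log_skew:.2f}")
```

A skewness close to zero after the transform indicates a roughly symmetric target, which is the situation described above.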

If one does it this way, the predictions from sklearn.linear_model.LinearRegression, Ridge, Lasso, or XGBRegressor look something like this:
image

and the metrics are similar to these:

Variance-score (R^2): 0.8546
Mean squared error: 0.0247
Root mean squared error: 0.1570
log root mean squared error: 0.0140
mean absolute error: 0.0856
Accuracy: 92.16%
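
For readers who want to try the approach, the following is a rough sketch and not the notebook from this thread. It assumes synthetic data with hypothetical features (`age`, `bmi`, `smoker`) and a target that is linear in log10 space, fits `sklearn.linear_model.LinearRegression` on the log10 target, and computes a few of the metrics named above. The numbers it prints will not match the ones listed here.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 2000

# Hypothetical features (assumption: made up for illustration only).
age = rng.integers(18, 65, size=n)
bmi = rng.normal(30, 6, size=n)
smoker = rng.integers(0, 2, size=n)
X = np.column_stack([age, bmi, smoker])

# Synthetic target: linear in log10 space plus Gaussian noise.
log_y = 3.0 + 0.01 * age + 0.005 * bmi + 0.6 * smoker + rng.normal(0, 0.1, size=n)
charges = 10 ** log_y

# Train on the log10-transformed target instead of the raw monetary value.
X_train, X_test, y_train, y_test = train_test_split(
    X, np.log10(charges), test_size=0.2, random_state=0
)
model = LinearRegression().fit(X_train, y_train)
pred_log = model.predict(X_test)

# Metrics are computed in log10 space, like the target the model was fit on.
r2 = r2_score(y_test, pred_log)
mse = mean_squared_error(y_test, pred_log)
rmse = float(np.sqrt(mse))
mae = mean_absolute_error(y_test, pred_log)

# Undo the transform to get predictions back in monetary units.
pred_charges = 10 ** pred_log

print(f"R^2:  {r2:.4f}")
print(f"MSE:  {mse:.4f}")
print(f"RMSE: {rmse:.4f}")
print(f"MAE:  {mae:.4f}")
print("coefficients (age, bmi, smoker):", model.coef_)
```

Because the model is linear, `model.coef_` can be read directly as the change in log10 cost per unit of each feature, which is the interpretability argument made earlier in this comment.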

@scissorsneedfoodtoo
Contributor

@lefthand3r, thank you for checking out this project and your thorough explanation. Those diagrams are really helpful! These are all things we definitely want to consider before we release this project.

@beaucarnes
Member Author

@lefthand3r Thanks for reviewing this and giving your input. Would you be interested in helping to redo this challenge to address the issues you brought up?

@maikroservice

@beaucarnes / @scissorsneedfoodtoo Thank you very much for the feedback! It feels awesome to contribute to open source when there are welcoming people like the two of you!

I would love to contribute to the challenge and will also take a look at the other machine learning challenges later this week.

@beaucarnes please tell me how to help.

@beaucarnes
Member Author

@lefthand3r Can you update the instructions and solution to incorporate your suggestions, including sklearn?

@maikroservice

It will take me a couple of days, but I will suggest something.

@scissorsneedfoodtoo
Contributor

@lefthand3r, awesome, looking forward to your suggestions.

@maikroservice

Please find my first draft of the challenge here:

https://colab.research.google.com/drive/1W_7_ztx8ahU_8MwUKq1_9Pbjobke-Tf7

@scissorsneedfoodtoo
Contributor

@lefthand3r, thank you for your patience and for all of your hard work on this draft! I know very little about data science and machine learning, so I won't be able to make very helpful suggestions with this project. But I read through your descriptions, ran all the cells, and everything LGTM as far as I can tell.

Could you take a look at this @beaucarnes?

@beaucarnes
Member Author

@maikroservice Thanks for putting this together. I really like what you've done. I'm trying to think of the best way to turn this into a project for people to complete while guiding them in the correct direction. Are you available to get on a call to discuss this? I think you would have some good insights.

@moT01 closed this as completed Jul 26, 2022