Draft: Integrate feature preprocessor as step in SKLL learner pipeline #569

Closed
wants to merge 2 commits into from

@mulhod mulhod commented Jul 18, 2022

The basic idea is that one output of running RSMTool should be a model file that can be loaded and used immediately on raw features of the same type used in the original experiment. This PR adds the feature preprocessor as a named step in the SKLL learner pipeline and also saves that pipeline to a separate file.

In [1]: import joblib

In [2]: model = joblib.load(open("output/ASAP2.pipeline.model", "rb"))

In [3]: ! head -2 train.csv
ID,DISCOURSE,ORGANIZATION,GRAMMAR,MECHANICS,LENGTH,score,score2
RESPONSE_1,4.93806460126142,-0.0846667513334603,-0.316793975540994,4.65591397849462,279,3,3

In [4]: ! head -2 output/ASAP2_pred_train.csv
spkitemid,raw,sc1,scale,raw_trim,raw_trim_round,scale_trim,scale_trim_round
RESPONSE_1,3.467158796079344,3.0,3.487689689334681,3.467158796079344,3,3.487689689334681,3

In [5]: model.predict([{"DISCOURSE": 4.93806460126142, "ORGANIZATION": -0.0846667513334603, "GRAMMAR": -0.316793975540994, "MECHANICS": 4.65591397849462}])
Out[5]: array([3.4671588])
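
For readers unfamiliar with the pattern, here is a minimal sketch of the general idea using plain scikit-learn. It is not the code in this PR and does not use SKLL or RSMTool internals. The step names (`vectorizer`, `preprocessor`, `estimator`), the use of `StandardScaler` as a stand-in for the feature preprocessor, the toy feature values, and the file name are all assumptions made for illustration.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Toy raw features: the column names echo train.csv above, the values are made up.
train_rows = [
    {"DISCOURSE": 4.9, "ORGANIZATION": -0.1, "GRAMMAR": -0.3, "MECHANICS": 4.7},
    {"DISCOURSE": 3.2, "ORGANIZATION": 0.4, "GRAMMAR": 0.1, "MECHANICS": 3.9},
    {"DISCOURSE": 2.1, "ORGANIZATION": 1.0, "GRAMMAR": 0.6, "MECHANICS": 2.5},
    {"DISCOURSE": 4.0, "ORGANIZATION": 0.2, "GRAMMAR": -0.2, "MECHANICS": 4.1},
]
train_scores = [3, 3, 2, 4]

# The preprocessing lives inside the pipeline as a named step, so the saved
# object accepts raw feature dictionaries directly at prediction time.
pipeline = Pipeline(
    [
        ("vectorizer", DictVectorizer(sparse=False)),
        ("preprocessor", StandardScaler()),  # hypothetical stand-in preprocessor
        ("estimator", LinearRegression()),
    ]
)
pipeline.fit(train_rows, train_scores)

# Save the whole pipeline to its own file (hypothetical file name), then reload it.
model_path = os.path.join(tempfile.mkdtemp(), "toy.pipeline.model")
joblib.dump(pipeline, model_path)
reloaded = joblib.load(model_path)

# Both objects should give the same prediction for the same raw features.
original_prediction = pipeline.predict(train_rows[:1])
reloaded_prediction = reloaded.predict(train_rows[:1])
print(np.allclose(original_prediction, reloaded_prediction))
```

Because the preprocessing step is fitted and serialized together with the estimator, the reloaded object needs no separate preprocessing code, which is the property the session above demonstrates.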

pep8speaks commented Jul 18, 2022

Hello @mulhod! Thanks for updating this PR.

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🎉

Comment last updated at 2022-07-18 19:11:35 UTC

@mulhod mulhod requested a review from desilinguist July 18, 2022 17:54
@mulhod mulhod marked this pull request as draft July 18, 2022 17:54
Review comment on rsmtool/modeler.py (outdated, resolved)
codecov bot commented Jul 18, 2022

Codecov Report

Merging #569 (b173a92) into main (933d17b) will decrease coverage by 0.09%.
The diff coverage is 80.00%.

❗ Current head b173a92 differs from pull request most recent head 9c2b546. Consider uploading reports for the commit 9c2b546 to get more accurate results

@@            Coverage Diff             @@
##             main     #569      +/-   ##
==========================================
- Coverage   93.14%   93.05%   -0.10%     
==========================================
  Files          31       31              
  Lines        4525     4552      +27     
==========================================
+ Hits         4215     4236      +21     
- Misses        310      316       +6     
Impacted Files Coverage Δ
rsmtool/modeler.py 96.36% <80.00%> (-1.22%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 933d17b...9c2b546.
