@nray @eiviani-lanl just migrating this tracking issue to the public repo via copy/paste of current state.
- finalize the edRVFL PR (DOC, CI: lint weights.py with `numpydoc` in the CI #26)
- finalize the regression PR (ENH, TST: Add rtol to classifier and a test. #44)
- other code changes/additions -- please expand as needed, there will likely be some
- decide on how to "split off" the library code and make it public while this repo remains private? Preservation of `git` history vs. keeping research plans to ourselves for now, etc. There are various logistical annoyances related to that
- Add an appropriate project license, considering what we import/link to, what FCI approved, and all that (may need to talk to FCI again if we change from the original agreement, and to sort out the multiple sub-repos for library vs. research code, etc.)
- (Tyler) PyPI release process -- we may not need binaries if we don't have compiled code, but should at least update `pyproject.toml` to match modern standards for support metadata, etc.
- `conda-forge` release process
- (Tyler) Portability -- we should probably be conformant with SPEC 0 -- this likely boils down to supporting Python 3.12-3.14 and testing for those in the CI
- Do an official/GitHub immutable release when the (likely separate?) library repo is public, and assign it a Zenodo DOI
- (Tyler) Make sure our documentation/docstrings are pretty decent -- i.e., we could follow https://numpydoc.readthedocs.io/en/latest/format.html, which has a validation/linter (https://numpydoc.readthedocs.io/en/latest/validation.html); we could also run doctests to make sure the docs stay up to date (code examples remain valid over time, etc.) -- see the numpydoc/doctest sketch below the list
- Hosting our documentation somewhere (GitHub pages, ReadTheDocs, whatever suits) - Navamita
- Pick a journal -- we are NOT eligible for JOSS (https://joss.theoj.org/) because we don't have 6 months of sustained public engagement on, e.g., GitHub
- Draft the paper somewhere -- I can create a private Overleaf link that up to 10 people can edit -- I believe this should be "ok" since it would be similar level of privacy to this GitHub repo (can we loop Kyle/Kostas in a bit, or not for the code?) -- https://www.overleaf.com/3111751118mvxmcpqnqxmw#eb42c5
- Get an LA-UR for the publication once we have a draft
- Draft developer docs on how to add additional RVFL architectures in the future? Is that process clear?
- (Emma) Remove `StandardScaler` usage from the estimators proper (see the pipeline sketch below the list)
- What about the `OneHotEncoder` usage? Double-check that as well vs. requiring properly encoded input (I think the one-hot requirement was more fundamental, but let's think about that)
- Emma would like to hoist the weight initializers out of the classes (see the initializer sketch below the list) - Emma
- Emma would like to see some more input validation/error checking (see the validation sketch below the list)
- Add a regression equivalent of `EnsembleRVFLClassifier()` - Emma
- Allow for "online learning" via `partial_fit`, since GrafoRVFL supports this, sklearn's `MLPClassifier` supports it, and it may be necessary for training with very large design matrices (see the `partial_fit` sketch below the list) - Emma
- Follow up on the difference in results for regression between GrafoRVFL and our regressor (https://github.com/lanl/ascr_rvfl/pull/54) - Navamita
- Follow up with Kostas regarding the iterative scheme we need to test (SGD vs. GD) for comparison with the exact solve, and clarify the details of the numerical test (see the exact-vs-iterative sketch below the list) - Navamita
- Split off appropriate parts of the library code (which parts?) for incorporation into a fork/branch of `scikit-learn`, and iterate with the team using a forked repo of `scikit-learn` (try to add all 3 of us in the condensed/redone commit history)
- In the `scikit-learn` fork, work with the team to draft the PR description we'll eventually use to propose our addition -- this should be well-written markdown that includes examples of relevant, well-cited papers, a clear explanation of what RVFL is, and so on, to convince the `scikit-learn` developers that this work is of sufficiently broad interest to be useful at the base of the Python ML ecosystem
- In the `scikit-learn` fork, adjust our docstrings to match the exact standards they use
- In the `scikit-learn` fork, make sure we pass their CI with their full test suite/linting requirements, etc.
- Decide on the name of our open source library project (`rvfl` may not suit if we also offer ELM, etc.?), then open source it under the LANL org
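Sketch for the docstring-quality item above: `numpydoc` exposes a Python-level `validate()` helper, and the standard-library `doctest` module can exercise the examples embedded in our docstrings. The module and function names below (`ascr_rvfl.weights`, `init_uniform_weights`) are placeholders, not a claim about our actual layout.

```python
# Hypothetical target names; numpydoc.validate.validate takes a fully qualified
# object name and reports docstring-format violations.
from numpydoc.validate import validate

report = validate("ascr_rvfl.weights.init_uniform_weights")  # placeholder name
for code, message in report["errors"]:
    print(f"{code}: {message}")

# Doctests keep the code examples in our docs executable over time.
import doctest
import ascr_rvfl.weights  # placeholder module

results = doctest.testmod(ascr_rvfl.weights, verbose=False)
print(f"{results.attempted} examples run, {results.failed} failed")
```

In CI, pytest's `--doctest-modules` flag would run the same doctests without any extra code.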
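Sketch for the `StandardScaler` removal item: once scaling is no longer baked into the estimators, users compose it externally with a pipeline. `RVFLClassifier` and its import path are stand-ins for whatever our estimator ends up being called.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

from ascr_rvfl import RVFLClassifier  # hypothetical import path/class name

X, y = make_classification(n_samples=200, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Scaling is an explicit pipeline step rather than hidden inside the estimator,
# so users who don't want it can simply drop the StandardScaler step.
model = make_pipeline(StandardScaler(), RVFLClassifier())
model.fit(X_train, y_train)
print(model.score(X_test, y_test))
```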
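Sketch for the weight-initializer item: one way to hoist the initializers out of the classes is to make them plain module-level callables that the estimator accepts as a parameter. All names below are hypothetical.

```python
import numpy as np


def uniform_init(n_features, n_nodes, rng):
    """Input weights and biases drawn from U(-1, 1)."""
    weights = rng.uniform(-1.0, 1.0, size=(n_features, n_nodes))
    biases = rng.uniform(-1.0, 1.0, size=n_nodes)
    return weights, biases


class ToyRVFL:
    """Toy stand-in showing the injection point, not our real estimator."""

    def __init__(self, n_nodes=100, weight_init=uniform_init, random_state=None):
        self.n_nodes = n_nodes
        self.weight_init = weight_init
        self.random_state = random_state

    def fit(self, X, y):
        rng = np.random.default_rng(self.random_state)
        # The initializer is swappable without subclassing the estimator.
        self.weights_, self.biases_ = self.weight_init(
            X.shape[1], self.n_nodes, rng
        )
        # ... readout training elided ...
        return self
```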
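Sketch for the input-validation item: scikit-learn already ships the validation helpers we would want (`check_X_y`, `check_array`, `check_is_fitted`); the toy estimator below just shows where they would be called.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.utils.validation import check_X_y, check_array, check_is_fitted


class ToyRVFLClassifier(BaseEstimator, ClassifierMixin):
    """Toy stand-in; only the validation calls are the point here."""

    def fit(self, X, y):
        # Rejects NaN/inf, ragged input, wrong dimensionality, bad dtypes, etc.
        X, y = check_X_y(X, y)
        self.classes_ = np.unique(y)
        self.n_features_in_ = X.shape[1]
        return self

    def predict(self, X):
        check_is_fitted(self)
        X = check_array(X)
        if X.shape[1] != self.n_features_in_:
            raise ValueError(
                f"X has {X.shape[1]} features; expected {self.n_features_in_}"
            )
        return np.full(X.shape[0], self.classes_[0])  # placeholder prediction
```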
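Sketch for the `partial_fit` item: this is the usage pattern we would want to support, mirroring sklearn's `MLPClassifier`. The estimator and its `partial_fit` signature are assumptions about a future API, not existing code.

```python
import numpy as np

from ascr_rvfl import RVFLClassifier  # hypothetical import path/class name

rng = np.random.default_rng(0)
classes = np.array([0, 1])
clf = RVFLClassifier()

# Stream the design matrix in chunks that fit in memory; as with MLPClassifier,
# classes is passed so the output layer can be sized before all labels are seen.
for _ in range(10):
    X_chunk = rng.normal(size=(1_000, 20))
    y_chunk = rng.integers(0, 2, size=1_000)
    clf.partial_fit(X_chunk, y_chunk, classes=classes)
```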
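Sketch for the exact-vs-iterative item: a rough numpy comparison of solving the RVFL readout exactly via least squares against iterating full-batch gradient descent on the same objective. Problem sizes, the step size, and the iteration count are illustrative only; the actual test details are what we need to pin down with Kostas.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, h = 500, 10, 200
X = rng.normal(size=(n, d))
y = rng.normal(size=n)

# Fixed random hidden layer; only the readout weights beta are trained.
W = rng.normal(size=(d, h))
H = np.tanh(X @ W)

# Exact solve of min_beta ||H beta - y||^2.
beta_exact, *_ = np.linalg.lstsq(H, y, rcond=None)

# Full-batch gradient descent on the same objective (SGD would subsample rows).
beta_gd = np.zeros(h)
step = 1.0 / np.linalg.norm(H, ord=2) ** 2  # conservative step, ensures convergence
for _ in range(20_000):
    beta_gd -= step * (H.T @ (H @ beta_gd - y))

rel_diff = np.linalg.norm(beta_gd - beta_exact) / np.linalg.norm(beta_exact)
print(f"relative difference between iterative and exact readout: {rel_diff:.2e}")
```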