[WIP] Command line interface #36

Closed
wants to merge 38 commits into from

henk789 (Member) commented Sep 23, 2022
This PR addresses #4 and adds a functional CLI.

Added Features

The following functionality is implemented:

Generating initial config with specified atoms

  • Create initial config with degree and atoms

Collecting DFT data

  • Collecting DFT data from single '.xyz' file/list of '.xyz' files
  • Automatically searching working directory for '.xyz' files
  • Converting '.xyz' files to data frame and storing as pickled file
  • Loading different DFT files
  • Splitting into train, test, and validation sets
  • Adjusting column prefix
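The file discovery and the three-way split listed above can be sketched in plain Python. This is only an illustration under assumptions, not the code from this PR: the helper names `find_xyz_files` and `split_indices`, the default split fractions, and the seed handling are all hypothetical.

```python
import random
from pathlib import Path


def find_xyz_files(directory="."):
    """Return the sorted '.xyz' files found directly in `directory`."""
    return sorted(Path(directory).glob("*.xyz"))


def split_indices(n_items, fractions=(0.8, 0.1, 0.1), seed=0):
    """Shuffle range(n_items) and cut it into train/test/validation lists.

    The fractions are hypothetical defaults, not values taken from the PR.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    n_train = int(fractions[0] * n_items)
    n_test = int(fractions[1] * n_items)
    train = indices[:n_train]
    test = indices[n_train:n_train + n_test]
    validation = indices[n_train + n_test:]  # remainder goes to validation
    return train, test, validation
```

Splitting on indices rather than on the data itself keeps the sketch independent of how the structures are stored (data frame, pickle, or list).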

Featurization

  • Loading pickled files
  • Loading '.xyz' files
  • Automatic core count detection
  • Specifying r_min/r_max/res_map as dict
  • Specifying r_min/r_max/res_map as int/float and applying it to all interactions
  • Exporting features to file
  • Exporting features to data frame
  • Knot strategies
  • Loading and dumping knots
  • Only training on energies
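The two ways of specifying r_min/r_max/res_map (a dict per interaction, or a single number applied to all interactions) could be normalized as in the sketch below. The function name `expand_setting` and the interaction tuples in the usage example are assumptions for illustration, not the PR's actual implementation.

```python
def expand_setting(value, interactions):
    """Return a dict mapping every interaction to its setting.

    A dict is checked for completeness and copied; a single int/float is
    applied to all interactions.
    """
    if isinstance(value, dict):
        missing = [i for i in interactions if i not in value]
        if missing:
            raise KeyError(f"no value given for interactions: {missing}")
        return dict(value)
    # bool is a subclass of int, so reject it explicitly
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError(f"expected dict, int, or float, got {type(value).__name__}")
    return {interaction: value for interaction in interactions}


# Hypothetical interactions for a tungsten system (two- and three-body).
interactions = [("W", "W"), ("W", "W", "W")]
r_max = expand_setting(6.0, interactions)
```

Here `r_max` holds the same cutoff for both interactions; passing a dict instead would keep per-interaction values.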

Fitting

  • Same parameters as with Python featurization
  • Exporting model to json file
  • Predicting with the fitted model
  • Reporting test errors (RMSE and MAE)
  • Training on a subset of the featurized data
  • Predicting on a subset of the featurized data
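For reference, the two test-error metrics named above follow their standard definitions. The sketch below uses only the standard library; the function names are illustrative and it is not the code the CLI calls.

```python
import math


def rmse(predicted, reference):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def mae(predicted, reference):
    """Mean absolute error between two equal-length sequences."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)
```

RMSE penalizes large outliers more strongly than MAE, so reporting both gives a rough sense of how the errors are distributed.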

Misc

  • Automatic installation for easy use
  • Default settings file
  • Settings file from bspline_config
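One common way to combine a default settings file with user-supplied options is a recursive merge of nested dictionaries. Whether the CLI does exactly this is not stated here; the sketch below is an assumption, and `merge_settings` as well as the example values are hypothetical (only the key names `verbose`, `features`, `batch_size`, and `table_template` come from this PR).

```python
import copy


def merge_settings(defaults, overrides):
    """Return defaults updated with overrides, merging nested dicts recursively."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are sections: merge them key by key.
            merged[key] = merge_settings(merged[key], value)
        else:
            # Scalars, lists, and new keys simply replace the default.
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical values, shown only to illustrate the merge.
defaults = {"verbose": 0, "features": {"batch_size": 100, "table_template": "features_{}"}}
user_options = {"features": {"batch_size": 50}}
settings = merge_settings(defaults, user_options)
```

With this approach a user file only needs to list the parameters that differ from the defaults.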

YAML parameters not yet used:

  • seed
  • data
    • db_path
    • max_per_file
    • min_diff
    • generate_stats
    • vasp_pressure
    • sources
      • pattern
  • basis
    • fit_offsets
    • mask_trim
    • knot_strategy
    • knots_path
    • load_knots
    • dump_knots
  • features
    • db_path
    • fit_forces
    • column_prefix
  • learning
    • splits_path

YAML parameters added:

  • verbose
  • data
    • pickle_path
    • train_pickle
    • test_pickle
  • features
    • batch_size
    • table_template
  • learning
    • batch_size

Usage

The CLI enables quick generation of models. The tungsten example can be executed with four commands:

> uf3_cli config 3 W
> uf3_cli collect options.yaml
> uf3_cli featurize options.yaml
> uf3_cli fit options.yaml

with the required file names and training settings specified in options.yaml.

The code is robust to errors and converts types where possible.
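As an illustration of what type conversion on settings values can look like, the sketch below coerces a value to an expected type and raises a clear error when that is not possible. The function name `coerce` and the accepted boolean spellings are assumptions, not the behavior of the CLI itself.

```python
def coerce(value, target_type):
    """Convert `value` to `target_type`, raising ValueError if it cannot be done."""
    if target_type is bool and isinstance(value, str):
        # bool("false") would be True, so strings need explicit handling.
        lowered = value.strip().lower()
        if lowered in ("true", "yes", "on", "1"):
            return True
        if lowered in ("false", "no", "off", "0"):
            return False
        raise ValueError(f"cannot convert {value!r} to bool")
    try:
        return target_type(value)
    except (TypeError, ValueError) as exc:
        raise ValueError(
            f"cannot convert {value!r} to {target_type.__name__}"
        ) from exc
```

For example, `coerce("3", int)` yields the integer 3, while `coerce("abc", int)` raises a `ValueError` naming the offending value.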

henk789 and others added 30 commits September 12, 2022 15:48
henk789 closed this Jul 21, 2023