Added Dask processor and Petastorm reader to train large datasets #970

Closed
wants to merge 79 commits into from

Commits on Oct 18, 2020

  1. 3b8713d

Commits on Oct 22, 2020

  1. 197314c
  2. Removed debug code

    tgaddair committed Oct 22, 2020
    6bf7083
  3. dd52d5c

Commits on Oct 23, 2020

  1. Added DataProcessingEngine

    tgaddair committed Oct 23, 2020
    1f228b7
  2. Fixed split

    tgaddair committed Oct 23, 2020
    12bfea7
  3. Fixed API

    tgaddair committed Oct 23, 2020
    b39d372

Commits on Oct 24, 2020

  1. Fixed data processing

    tgaddair committed Oct 24, 2020
    5b5fc60
  2. Drop index

    tgaddair committed Oct 24, 2020
    8b2b594
  3. Added Petastorm dataset

    tgaddair committed Oct 24, 2020
    b7f9546
  4. Cleaned up dataset creation

    tgaddair committed Oct 24, 2020
    c952130
  5. Added Dataset

    tgaddair committed Oct 24, 2020
    2c93b60
  6. Train from dataset

    tgaddair committed Oct 24, 2020
    ea6b4a7
  7. Fixed bugs

    tgaddair committed Oct 24, 2020
    b203fa6

Commits on Oct 25, 2020

  1. Fixed string_utils

    tgaddair committed Oct 25, 2020
    6b3bb08
  2. Fixed tests

    tgaddair committed Oct 25, 2020
    ef2a314
  3. Fixed temp dataset

    tgaddair committed Oct 25, 2020
    9a743fe
  4. Added Backend

    tgaddair committed Oct 25, 2020
    a630f14
  5. Plumb through backend

    tgaddair committed Oct 25, 2020
    945d56e
  6. 2aab9c5
  7. 0a0a7c4
  8. 3419178
  9. Fixed Pandas processing

    tgaddair committed Oct 25, 2020
    9d13c71
  10. Added cache management

    tgaddair committed Oct 25, 2020
    95a7952
  11. Fixed unit tests

    tgaddair committed Oct 25, 2020
    22b7538
  12. fd7cbab
  13. Added numerical test

    tgaddair committed Oct 25, 2020
    b63b316
  14. RayBackend -> DaskBackend

    tgaddair committed Oct 25, 2020
    0941ecd
  15. Fixed read_xsv

    tgaddair committed Oct 25, 2020
    77a59f9

Commits on Oct 26, 2020

  1. Fixed set feature

    tgaddair committed Oct 26, 2020
    cab90a1
  2. Untracked Netflix example

    tgaddair committed Oct 26, 2020
    755c204
  3. Added Dask requirements

    tgaddair committed Oct 26, 2020
    a105e41
  4. Fixed bag feature

    tgaddair committed Oct 26, 2020
    0cbb582
  5. Fixed vector feature

    tgaddair committed Oct 26, 2020
    87f6e4b
  6. Fixed h3

    tgaddair committed Oct 26, 2020
    5e089d4
  7. Fixed date

    tgaddair committed Oct 26, 2020
    4f6c0ba
  8. Fixed timeseries

    tgaddair committed Oct 26, 2020
    93cbccd

Commits on Oct 27, 2020

  1. 0e93043
  2. Fixed reshaping

    tgaddair committed Oct 27, 2020
    bb33fc0
  3. Fixed tests

    tgaddair committed Oct 27, 2020
    7e8d3c3
  4. Removed debug print

    tgaddair committed Oct 27, 2020
    8924ef6
  5. Fixed image processing

    tgaddair committed Oct 27, 2020
    8e95be2
  6. Added tests for exceptions

    tgaddair committed Oct 27, 2020
    b19ea58
  7. meta_kwargs -> map_objects

    tgaddair committed Oct 27, 2020
    0afade5
  8. Removed unused methods

    tgaddair committed Oct 27, 2020
    aabb582
  9. Removed prints

    tgaddair committed Oct 27, 2020
    ac29b9d
  10. Reduced runtime

    tgaddair committed Oct 27, 2020
    e3e7a14

Commits on Oct 28, 2020

  1. 318da2f
  2. Added dask extra

    tgaddair committed Oct 28, 2020
    1a98f33
  3. Fixed concatenation

    tgaddair committed Oct 28, 2020
    f625402
  4. Fixed split empty dataset

    tgaddair committed Oct 28, 2020
    418600a
  5. Fixed subselect

    tgaddair committed Oct 28, 2020
    ed9451b
  6. a3de815

Commits on Oct 30, 2020

  1. Moved meta.json

    tgaddair committed Oct 30, 2020
    eacfd63
  2. Fixed cache key

    tgaddair committed Oct 30, 2020
    4d8e690
  3. Updated Petastorm

    tgaddair committed Oct 30, 2020
    cd70992
  4. Spawn Dask tests

    tgaddair committed Oct 30, 2020
    6c94f22
  5. cb8bb91
  6. 985b5bd
  7. Added tables

    tgaddair committed Oct 30, 2020
    60ad4f4

Commits on Oct 31, 2020

  1. Fixed image features

    tgaddair committed Oct 31, 2020
    dff8461
  2. Fixed string_utils.py

    tgaddair committed Oct 31, 2020
    0469097
  3. Fixed kfold

    tgaddair committed Oct 31, 2020
    1922f35
  4. Fixed test splits

    tgaddair committed Oct 31, 2020
    8952e23
  5. 5057be6
  6. Fixed test_visualization.py

    tgaddair committed Oct 31, 2020
    20351e0
  7. Fixed Dask

    tgaddair committed Oct 31, 2020
    92d64c1
  8. Fixed test_experiment.py

    tgaddair committed Oct 31, 2020
    25ab59b

Commits on Nov 6, 2020

  1. 9f92c38

Commits on Nov 18, 2020

  1. Resolved conflicts

    tgaddair committed Nov 18, 2020
    ccbaac9
  2. 60f1e59
  3. 53a427a
  4. Fixed Dask tests

    tgaddair committed Nov 18, 2020
    877b604
  5. Refactored Batcher

    tgaddair committed Nov 18, 2020
    8b819be
  6. 3268700

Commits on Nov 26, 2020

  1. Resolved conflicts

    tgaddair committed Nov 26, 2020
    093e3d2
  2. Fixed API

    tgaddair committed Nov 26, 2020
    d0e3aa1
  3. Fixed index column

    tgaddair committed Nov 26, 2020
    2d6a231
  4. Fixed reshaping

    tgaddair committed Nov 26, 2020
    b7da27d