[ENH] technical roadmap 2023-2024 #4691

Open
fkiraly opened this issue Jun 11, 2023 · 3 comments
Labels
enhancement (Adding new functionality), good first issue (Good for newcomers)

Comments

@fkiraly
Collaborator

fkiraly commented Jun 11, 2023

Umbrella issue for collecting, consolidating, and prioritizing roadmap items for the 2023-2024 increment.

How to contribute:

  • user or developer - community suggestions appreciated in this thread!
  • new to open source and want to contribute code? Check if you see something interesting and get in touch on Discord - at meet-ups, workstream meetings, or just chat
@fkiraly fkiraly added good first issue Good for newcomers enhancement Adding new functionality labels Jun 11, 2023
@fkiraly
Collaborator Author

fkiraly commented Jun 11, 2023

From developer roadmap planning June 9

(to be used to prioritize)

@benHeid, @Bensliman2, @fkiraly, @JonathanBechtel, @yarnabrina

👍 = subjective prio (3 per participant)
✔️ = working on this (current plan)

backend

  • distributed & parallel support (✔️ if time)
    • end-to-end dask in boilerplate layer
    • framework support for parallelization within estimators
    • some performant implementation of frequently used estimators

forecasting

  • global & hierarchical forecasting

    • global 👍 👍
      • Global Forecasting Model Support
      • global forecasting framework support - panel with train instances different from test instances
    • hierarchical forecasting 👍
      • integrate some of the functionality from Nixtla/hierarchicalforecast
  • deep learning forecasters 👍 👍 👍

    • Implement some deep learning forecasters

      • Prob(PNN) (sorry for proposing my own methods :D )
      • ES-RNN
      • DeepAR/NBeats/Transformers (or creating an adapter to libraries that already implement them)
      • Implement the DL forecasters LTSF-Linear/NLinear/DLinear ✔️
      • Implement Normalizing Flows DL Models for Time series
    • 3rd party package interface/adapters

      • add support for gluonts estimators
      • add support for pytorch-forecasting estimators
    • deep learning forecaster sub-base-class design - networks vs forecast interface

  • forecasting module performance issues 👍 ✔️

    • Make update_predict more performant.
      • Use batch prediction of DL forecasters, for example, instead of making a predict call for each time step separately
    • reduction forecasters' performance...
      • finish redesign (bad performance)
      • or performant reimplementation
      • or skforecast interface
      • or nixtla reduction forecaster interface
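
For readers new to the reduction forecasters mentioned in the performance item above, here is a minimal sketch of the idea, assuming a sliding-window tabularization and a recursive multi-step strategy. It is plain Python, not sktime's implementation; `sliding_window`, `recursive_forecast`, and `drift_regressor` are hypothetical names introduced only for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def sliding_window(
    y: Sequence[float], window_length: int
) -> Tuple[List[List[float]], List[float]]:
    """Tabularize a series: each row holds `window_length` lags, the target is the next value."""
    X, target = [], []
    for i in range(len(y) - window_length):
        X.append(list(y[i : i + window_length]))
        target.append(y[i + window_length])
    return X, target


def recursive_forecast(
    y: Sequence[float],
    window_length: int,
    predict_one: Callable[[List[float]], float],
    horizon: int,
) -> List[float]:
    """Forecast `horizon` steps, feeding each prediction back into the window."""
    window = list(y[-window_length:])
    preds = []
    for _ in range(horizon):
        y_hat = predict_one(window)
        preds.append(y_hat)
        window = window[1:] + [y_hat]  # drop oldest lag, append newest prediction
    return preds


def drift_regressor(window: List[float]) -> float:
    """Hypothetical stand-in for a fitted tabular regressor: last value plus mean step."""
    steps = [b - a for a, b in zip(window, window[1:])]
    return window[-1] + sum(steps) / len(steps)


y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
X, target = sliding_window(y, window_length=3)
forecast = recursive_forecast(y, window_length=3, predict_one=drift_regressor, horizon=2)
```

The recursive loop makes one prediction call per step, which is one reason a naive reducer can be slow; a direct strategy or a vectorized tabularization would trade this off differently.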

new modules

  • Data Simulators/Generation 👍 👍 👍

    • Think about Deep Generative Models and how they can be integrated
      • Models are e.g. TimeGAN/COT-GAN, but also vanilla INNs, VAEs, and GANs.
      • Connection between such models and other tasks such as forecasting, preprocessing, postprocessing, etc.
    • Metrics for synthetic data generation/data simulators
      • Discriminative/Predictive Scores
  • annotation module design & framework - big project... 👍

    • subtask definition, base classes
    • initial population of estimators
    • reduction estimators, e.g., sliding window forecast and residual detector
  • probabilistic prediction interface, skpro etc 👍 👍✔️

    • more instances of BaseDistribution, tfp interface
    • skpro refactor; sktime-skpro functionality shift
    • normalizing flow for tabular data
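
To illustrate the "sliding window forecast and residual detector" reduction listed under the annotation module above, here is a rough sketch under simple assumptions: the forecaster is a rolling mean, and a point is flagged when its one-step-ahead residual lies more than `threshold` standard deviations from the mean residual. The function name and its parameters are hypothetical and do not reflect any existing sktime interface.

```python
from statistics import mean, stdev
from typing import List, Sequence


def residual_anomalies(
    y: Sequence[float], window_length: int = 3, threshold: float = 2.0
) -> List[int]:
    """Return indices whose one-step-ahead forecast residual is unusually large."""
    indices = list(range(window_length, len(y)))
    # One-step-ahead forecast: mean of the preceding window (stand-in for any forecaster).
    residuals = [y[t] - mean(y[t - window_length : t]) for t in indices]
    mu, sigma = mean(residuals), stdev(residuals)
    if sigma == 0:
        return []  # constant residuals: nothing stands out
    return [t for t, r in zip(indices, residuals) if abs(r - mu) > threshold * sigma]


series = [10, 10, 10, 10, 10, 10, 50, 10, 10, 10, 10, 10]
flagged = residual_anomalies(series, window_length=3, threshold=2.0)
```

In a framework design, the rolling mean would be replaced by any forecaster and the thresholding rule by any detector on the residual series, which is what makes this a reduction rather than a single algorithm.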

pipelines

  • Finalise the Graphpipeline ✔️

benchmarking, tuning

  • benchmarking

    • partly covered by internship projects
    • benchmarking study with prophet
    • benchmark datasets interface
    • evaluation functionality
    • improvements to evaluate - by-sample support, parallelisation
  • hyperparameter tuning - "tuner" abstraction or similar 👍 ✔️

    • to be used in tuners across learning tasks
    • slottable optuna, hyperopt etc tuners, for all tasks
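
One way a "slottable" tuner abstraction could look is an ask/tell protocol that any search backend implements. The sketch below is an assumption for discussion only, not an agreed design: `Tuner`, `GridTuner`, and `run_tuning` are hypothetical names, and grid search stands in for backends such as optuna or hyperopt.

```python
from itertools import product
from typing import Any, Callable, Dict, List, Optional, Protocol, Tuple

Params = Dict[str, Any]


class Tuner(Protocol):
    """Hypothetical backend-agnostic interface: propose params, receive scores."""

    def ask(self) -> Optional[Params]: ...

    def tell(self, params: Params, score: float) -> None: ...


class GridTuner:
    """Exhaustive grid search expressed through the ask/tell interface."""

    def __init__(self, space: Dict[str, List[Any]]) -> None:
        keys = sorted(space)
        self._grid = (
            dict(zip(keys, values)) for values in product(*(space[k] for k in keys))
        )

    def ask(self) -> Optional[Params]:
        return next(self._grid, None)  # None signals that the search is exhausted

    def tell(self, params: Params, score: float) -> None:
        pass  # grid search ignores feedback; adaptive backends would use it


def run_tuning(
    tuner: Tuner, objective: Callable[[Params], float]
) -> Tuple[Optional[Params], float]:
    """Minimize `objective` using whatever tuner is slotted in."""
    best_params, best_score = None, float("inf")
    while (params := tuner.ask()) is not None:
        score = objective(params)
        tuner.tell(params, score)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


best_params, best_score = run_tuning(
    GridTuner({"a": [1, 2, 3], "b": [0, 1]}),
    objective=lambda p: (p["a"] - 2) ** 2 + (p["b"] - 1) ** 2,
)
```

Because `run_tuning` only depends on `ask` and `tell`, the same loop could in principle be reused for forecasting, classification, or any other task where the objective wraps an evaluation routine.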

docs

  • tutorials, docs ✔️
    • tutorials for using transformations as feature columns in forecasters (note: may exist already, which I (AR) missed)
      • lag/lead
      • stl
      • tsfresh
      • etc.

testing, CI

  • CI performance ✔️
    • decrease test time
    • add manual workflows (available to admin/core dev+) to run all/specific tests on an estimator (wrapper around check_estimator) 👍

@fkiraly
Collaborator Author

fkiraly commented Jun 11, 2023

From mini-roadmap session at summer onboarding event, June 7

@hazrulakmal, @Laprama, @fkiraly (only moderating), @kirilral, @TonyZhangkz

👍 = subjective prio (3 per participant)
✔️ = working on this (current plan)

develop classification / clustering (broad interest) 👍👍 👍

  • integrate the MrSEQL classifier back into sktime; it was removed due to a Cython dependency (there is an existing issue tracking this)
  • other

forecasting ✔️ ✔️👍 👍👍 👍 👍

  • develop forecasting (probabilistic / Deep Learning)
  • develop forecasting models in general (probabilistic/classic ML). This is covered in the list of projects.

Benchmarking framework ✔️

Hyperparameter optimisation framework and tuning 👍 👍

  • Integrate optimisation libraries, e.g. hyperopt, scikit-optimize, etc., into sktime. There is an issue about this on GitHub. There should be a discussion on how to integrate all of these at a high level.
  • conduct tuning experiments so that we can set good default values and provide informative guides for potential users.

Dataset Module ✔️ 👍
Make the dataset library more comprehensive and reflective of real-world settings. Work on the API design to help visualise information about the datasets.

extension of annotation module with anomaly detection algorithms, e.g. distance based anomaly detection, distribution methods, etc. 👍

@fkiraly fkiraly pinned this issue Jun 11, 2023
@fkiraly
Collaborator Author

fkiraly commented Jul 28, 2023

@fkiraly's notes from discussions & roadmapping:

  • workstream - deep learning forecasters - new feature & architecture
  • workstream - benchmarking, data sets - new features & refactor, bugfixing
    • synergizes with DL
    • various issues with splitters
  • workstream - proba forecasting, skpro
    • presented at PyData in Sep
    • skpro rearchitecture ongoing, almost done
  • rework reduction forecasters, address bugs & user requests
    • various old and new issues (bugs, feature requests)
    • performance problem of rearchitected reducers
