[Doc] First push of the developer documentation #127

franchuterivera · 2021-03-08T19:33:18Z

First Push for the developer documentation.

franchuterivera · 2021-03-18T11:57:32Z

Also, add documentation for the run identifiers

ravinkohli · 2021-03-23T18:20:09Z

docs/dev.rst

+In other words, each of the individual models fitted by SMAC are (and comply) with Scikit-Learn pipeline and framework. For example, when a pipeline is fitted,
+we use pickle to save it to disk as stated `here <https://scikit-learn.org/stable/modules/model_persistence.html>`_. SMAC runs an optimization loop that proposes new
+configurations based on bayesian optimization, which comply with the package `ConfigSpace <https://automl.github.io/ConfigSpace/master/>`_. These configurations are
+translated to a pipeline configuration, fitted and saved to disc using the function evaluator `ExecuteTaFuncWithQueue`. The later is basically a worker that that


It should be "The latter is basically".

ravinkohli · 2021-04-07T11:26:59Z

docs/dev.rst

+Developer Documentation
+=======================
+
+This documentation summarizes how the AutoPyTorch code works, and it is meant to guide developers


Maybe: "and is meant as a guide to help developers contribute to it"?

ravinkohli · 2021-04-07T11:27:50Z

docs/dev.rst

+
+AutoPyTorch relies on the `SMAC <https://automl.github.io/SMAC3/master/>`_ library to build individual models,
+which are later ensembled together using ensemble selection by `Caruana et al. (2004) <https://dl.acm.org/doi/pdf/10.1145/1015330.1015432>`_.
+Therefore, there are two main parts of the code: `AutoMLSMBO`, which is our interface to the SMAC package, and


AutoMLSMBO, which acts as an interface to SMAC
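
For readers unfamiliar with the ensemble selection mentioned in this hunk, here is a minimal sketch of the greedy procedure described by Caruana et al. (forward selection with replacement). It is an illustration only and not AutoPyTorch's implementation: the function names, the use of mean squared error as the loss, and the toy predictions are all assumptions.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def ensemble_selection(predictions, y_true, ensemble_size):
    """Greedy forward selection with replacement.

    predictions: dict mapping a (hypothetical) model name to its list of
    predictions on a held-out set. Returns the selected model names;
    a name that appears several times gets a proportionally larger weight.
    """
    selected = []
    running_sum = [0.0] * len(y_true)
    for _ in range(ensemble_size):
        best_name, best_loss = None, float("inf")
        for name, preds in predictions.items():
            # Loss of the averaged ensemble if this model were added once more.
            candidate = [
                (s + p) / (len(selected) + 1) for s, p in zip(running_sum, preds)
            ]
            loss = mean_squared_error(y_true, candidate)
            if loss < best_loss:
                best_name, best_loss = name, loss
        selected.append(best_name)
        running_sum = [s + p for s, p in zip(running_sum, predictions[best_name])]
    return selected


# Toy data: "a" overshoots, "b" undershoots, "c" is simply wrong.
y_true = [1.0, 0.0]
predictions = {"a": [1.1, 0.1], "b": [0.8, -0.2], "c": [0.0, 1.0]}
selected = ensemble_selection(predictions, y_true, ensemble_size=3)
```

Because selection is with replacement, a strong model can be picked repeatedly, which is how the procedure assigns it a higher weight in the final average.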

ravinkohli · 2021-04-07T11:37:40Z

docs/dev.rst

+==========================
+
+AutoPyTorch relies on Scikit-Learn `Pipeline <https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html>`_ to build an individual algorithm.
+In other words, each of the individual models fitted by SMAC are (and comply) with Scikit-Learn pipeline and framework. For example, when a pipeline is fitted,


A pipeline can consist of various preprocessing steps for tabular data, including imputation, encoding, scaling, and feature preprocessing. These are followed by the setup of the training algorithm, which includes building a network embedding for categorical features, a network backbone that is responsible for feature extraction, and a network head that is responsible for decision making, as well as choosing an appropriate learning rate scheduler, optimisation algorithm, and initialisation of the network parameters.

The preprocessing steps are aggregated into a scikit-learn ColumnTransformer to maintain the order of the columns, and the column transformer is wrapped into a TabularColumnTransformer class, which complies with a torchvision transform. If the dataset is small enough, we apply this transformer to the whole dataset in the EarlyPreprocessing node; otherwise, it is added as a transform to the data loader object, which transforms the dataset batchwise.

This can also be moved to the end (AutoML part) where you talk about the pipeline

ravinkohli · 2021-04-07T11:44:27Z

docs/dev.rst

+In other words, each of the individual models fitted by SMAC are (and comply) with Scikit-Learn pipeline and framework. For example, when a pipeline is fitted,
+we use pickle to save it to disk as stated `here <https://scikit-learn.org/stable/modules/model_persistence.html>`_. SMAC runs an optimization loop that proposes new
+configurations based on bayesian optimization, which comply with the package `ConfigSpace <https://automl.github.io/ConfigSpace/master/>`_. These configurations are
+translated to a pipeline configuration, fitted and saved to disc using the function evaluator `ExecuteTaFuncWithQueue`. The latter is basically a worker that that


These configurations are then translated to AutoPyTorch pipelines, fitted, and finally saved to disc

ravinkohli · 2021-04-07T11:45:05Z

docs/dev.rst

+configurations based on bayesian optimization, which comply with the package `ConfigSpace <https://automl.github.io/ConfigSpace/master/>`_. These configurations are
+translated to a pipeline configuration, fitted and saved to disc using the function evaluator `ExecuteTaFuncWithQueue`. The latter is basically a worker that that
+reads a dataset from disc, fits a pipeline, and collect the performance result which is communicated back to the main process via a Queue. This worker manages
+resources using `Pynisher <https://github.com/automl/pynisher>`_, and it usually does so by creating a new process.


memory and time constraints, maybe?
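
To make the worker description in this hunk concrete, here is a rough sketch of the pattern: run one evaluation in a child process, send the result back through a queue, and enforce a wall-clock limit. It uses only the standard library as a stand-in for Pynisher, covers the time limit only (not memory), and assumes a POSIX system where the `fork` start method is available. `_worker`, `evaluate_with_limit`, and the config keys are hypothetical names, not `ExecuteTaFuncWithQueue` itself.

```python
import multiprocessing as mp
import queue
import time


def _worker(result_queue, config):
    """Stand-in for 'read the dataset, fit a pipeline, score it'."""
    time.sleep(config.get("fit_seconds", 0.0))  # pretend to train
    result_queue.put({"status": "SUCCESS", "loss": 1.0 - config["quality"]})


def evaluate_with_limit(config, time_limit):
    """Run one evaluation in a child process under a wall-clock limit."""
    ctx = mp.get_context("fork")  # assumption: POSIX system
    result_queue = ctx.Queue()
    proc = ctx.Process(target=_worker, args=(result_queue, config))
    proc.start()
    try:
        # The main process blocks here until the worker reports back.
        result = result_queue.get(timeout=time_limit)
    except queue.Empty:
        # No answer in time: report a failure and stop the child.
        result = {"status": "TIMEOUT", "loss": float("inf")}
        proc.terminate()
    proc.join()
    return result
```

Running the evaluation in its own process means a run that hangs or crashes cannot take the main optimization loop down with it, which is the reason for paying the process start-up cost.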

ravinkohli · 2021-04-07T11:47:30Z

docs/dev.rst

+reads a dataset from disc, fits a pipeline, and collect the performance result which is communicated back to the main process via a Queue. This worker manages
+resources using `Pynisher <https://github.com/automl/pynisher>`_, and it usually does so by creating a new process.
+
+The Scikit-learn pipeline inherits from the `BaseEstimator <https://scikit-learn.org/stable/modules/generated/sklearn.base.BaseEstimator.html>`_, which implies that we have to honor the `Scikit-Learn development Guidelines <https://scikit-learn.org/stable/developers/develop.html>`_. Of particular interest is that any estimator must define as attributes, the arguments that the class constructor receives (see `get_params and set_params` from the above documentation).


In particular, the arguments that the class constructor of any estimator receives must be defined as attributes of the class (see blah blah).
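
A small sketch may help show why this rule matters. The mixin below is a simplified imitation of the `get_params`/`set_params` contract, written with the standard library only; it is not scikit-learn's actual code, and `ParamsMixin`, `GoodComponent`, and `BadComponent` are hypothetical names. The idea it illustrates is that the parameter names are read from the constructor signature and then looked up as attributes, so an attribute stored under a different name cannot be found.

```python
import inspect


class ParamsMixin:
    """Simplified imitation of the get_params/set_params contract."""

    def get_params(self):
        # Read the constructor's argument names, then fetch each as an attribute.
        names = [
            name
            for name in inspect.signature(type(self).__init__).parameters
            if name != "self"
        ]
        return {name: getattr(self, name) for name in names}

    def set_params(self, **params):
        valid = self.get_params()
        for name, value in params.items():
            if name not in valid:
                raise ValueError(f"Invalid parameter {name!r}")
            setattr(self, name, value)
        return self

    def clone(self):
        # Rebuild an unfitted copy purely from the constructor arguments.
        return type(self)(**self.get_params())


class GoodComponent(ParamsMixin):
    def __init__(self, learning_rate=0.01, random_state=None):
        self.learning_rate = learning_rate  # stored under the same name
        self.random_state = random_state


class BadComponent(ParamsMixin):
    def __init__(self, learning_rate=0.01):
        self.lr = learning_rate  # renamed, so get_params() cannot find it
```

With `GoodComponent`, cloning and parameter updates work without any extra code; with `BadComponent`, `get_params()` raises `AttributeError` because there is no `learning_rate` attribute.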

ravinkohli · 2021-04-07T11:49:39Z

docs/dev.rst

+
+The Scikit-learn pipeline inherits from the `BaseEstimator <https://scikit-learn.org/stable/modules/generated/sklearn.base.BaseEstimator.html>`_, which implies that we have to honor the `Scikit-Learn development Guidelines <https://scikit-learn.org/stable/developers/develop.html>`_. Of particular interest is that any estimator must define as attributes, the arguments that the class constructor receives (see `get_params and set_params` from the above documentation).
+
+Regarding multiprocessing, AutoPyTorch and SMAC work with `Dask.distributed <https://distributed.dask.org/en/latest/>`_. We only submits jobs to Dask up to the number of 


To speed up the search, AutoPyTorch and SMAC use Dask.distributed, which provides an excellent (or good) infrastructure to effectively parallelize the computations involved.
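
The "submit only up to the number of workers, then wait" behaviour described in the hunk can be sketched as follows. This uses `concurrent.futures` from the standard library as a stand-in for Dask, so it shows the throttling idea rather than the actual AutoPyTorch or SMAC code; `run_throttled` and its arguments are hypothetical names.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_throttled(evaluate, configs, n_workers):
    """Keep at most n_workers evaluations in flight at any time."""
    results = {}
    pending = {}  # future -> config it is evaluating
    max_in_flight = 0
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for config in configs:
            # All workers busy: block until at least one of them finishes.
            if len(pending) >= n_workers:
                done, _ = wait(pending, return_when=FIRST_COMPLETED)
                for future in done:
                    results[pending.pop(future)] = future.result()
            pending[pool.submit(evaluate, config)] = config
            max_in_flight = max(max_in_flight, len(pending))
        # Collect whatever is still running once nothing is left to submit.
        for future, config in pending.items():
            results[config] = future.result()
    return results, max_in_flight


results, peak = run_throttled(lambda c: c * c, range(10), n_workers=3)
```

Submitting lazily like this matters for a Bayesian optimization loop: a configuration proposed only when a worker is free can take all results finished so far into account.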

ravinkohli · 2021-04-07T11:49:53Z

docs/dev.rst

+Regarding multiprocessing, AutoPyTorch and SMAC work with `Dask.distributed <https://distributed.dask.org/en/latest/>`_. We only submits jobs to Dask up to the number of 
+workers, and wait for a worker to be available before continuing.
+
+At the end of a SMAC runs, the results will be available in the `temporary_directory` provided to the API, in particular inside of the `<temporary_directory>/smac3-output/run_<SEED>/` directory. One can debug


run*, in particular inside the '<temporary blah blah

ravinkohli · 2021-04-07T11:50:48Z

docs/dev.rst

+
+At the end of a SMAC runs, the results will be available in the `temporary_directory` provided to the API, in particular inside of the `<temporary_directory>/smac3-output/run_<SEED>/` directory. One can debug
+the performance of the individual models using the file `runhistory.json` located in this area. Every individual model will be stored in `<temporary_directory>/.autoPyTorch/runs`. 
+In this later directory we store the fitted model (during cross-validation we store a single Voting Classifier/Regressor, which is the soft voting outcome of k-Fold cross-validation), the Out-Of-Fold


In this 'runs' directory,
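
As a debugging aid, the run history mentioned in this hunk can be loaded with a few lines of Python. The sketch below only relies on the path layout quoted above (`<temporary_directory>/smac3-output/run_<SEED>/runhistory.json`); `load_runhistory` is a hypothetical helper, and the JSON written in the demo is a placeholder rather than the real SMAC schema.

```python
import json
import tempfile
from pathlib import Path


def load_runhistory(temporary_directory, seed):
    """Load runhistory.json for one SMAC run from the temporary directory."""
    path = (
        Path(temporary_directory)
        / "smac3-output"
        / f"run_{seed}"
        / "runhistory.json"
    )
    with path.open() as handle:
        return json.load(handle)


# Demo against a fake directory tree; the file content is a placeholder.
with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp) / "smac3-output" / "run_1"
    run_dir.mkdir(parents=True)
    (run_dir / "runhistory.json").write_text(json.dumps({"data": []}))
    history = load_runhistory(tmp, seed=1)
    top_level_keys = sorted(history)
```

On a real run, inspecting the top-level keys first is a safe way to find out how the file is organised before digging into individual entries.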

ravinkohli

Hey, overall this documentation looks good and provides important details on how AutoPyTorch works. I have made some comments that may improve the information we give to developers, along with some minor grammar updates. All of these are optional, so please apply whichever you see as appropriate.

ravinkohli · 2021-04-07T15:29:28Z

docs/dev.rst

+Developer Documentation
+=======================
+
+This documentation summarizes how the AutoPyTorch code works, and is meant as a guide for the developers to help contribute to it. .


There is an extra "." at the end.

ravinkohli

There is a minor typo but otherwise it looks great.

franchuterivera added the Documentation label Mar 8, 2021

franchuterivera mentioned this pull request Mar 18, 2021

[Feat] Better traditional pipeline cutoff time #141

Merged

ravinkohli reviewed Mar 23, 2021

franchuterivera changed the title ~~First push of the developer documentation~~ [Doc] First push of the developer documentation Mar 24, 2021

franchuterivera added 3 commits April 7, 2021 10:53

First push of the developer documentation

5513d52

Feedback from Ravin

a82300c

Document scikit-learn develop guide

9851b1b

franchuterivera force-pushed the refactor_development_devdoc branch from 27b7e07 to 9851b1b Compare April 7, 2021 09:05

franchuterivera requested a review from ravinkohli April 7, 2021 09:05

ravinkohli reviewed Apr 7, 2021

ravinkohli requested changes Apr 7, 2021

Feedback from Ravin

e6e9a5c

franchuterivera requested a review from ravinkohli April 7, 2021 14:58

ravinkohli reviewed Apr 7, 2021

ravinkohli approved these changes Apr 7, 2021

Delete extra point

6c33465

ravinkohli merged commit a4e08e2 into automl:refactor_development Apr 15, 2021

github-actions bot pushed a commit that referenced this pull request Apr 15, 2021

Francisco Rivera Valverde: [Doc] First push of the developer document…

dfcd7d9

…ation (#127)
