Partial column types #776

Merged
merged 17 commits into from
Sep 20, 2019

Conversation

yufei-12
Contributor

This pull request adds support for partial column_types provided by the user.

The pull request is ready to be merged.

@@ -8,6 +8,7 @@
from sklearn.feature_extraction import text
from tensorflow.python.util import nest

import autokeras as ak
Collaborator

remove this line

@@ -129,6 +129,26 @@ def map_func(x):
    assert isinstance(new_dataset, tf.data.Dataset)


column_names_for_tests = [
Collaborator

avoid the duplication

@@ -218,6 +218,11 @@ def assemble(self, input_node):
        self.infer_column_types()
        if input_node.column_types is None:
            input_node.column_types = self.column_types
        # partial column_types is provided
        elif len(input_node.column_types) < len(input_node.column_names):
Collaborator

I don't think we need this if. Just directly do the for loop.
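
The suggestion in this comment can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the function name fill_column_types, the dictionary layout, and the "numerical"/"categorical" labels are assumptions made for the example. The point is that a plain loop over the column names handles the partial, complete, and empty cases alike, so no length comparison is needed.

```python
def fill_column_types(column_names, user_column_types, inferred_column_types):
    """Return a complete name -> type mapping (hypothetical helper).

    Entries supplied by the user are kept as-is; any column the user
    left out falls back to its inferred type.
    """
    merged = dict(user_column_types or {})
    for name in column_names:
        if name not in merged:
            merged[name] = inferred_column_types[name]
    return merged


# Illustrative data only; the column names and type labels are assumed.
column_names = ["age", "sex", "fare"]
inferred = {"age": "numerical", "sex": "categorical", "fare": "numerical"}

# Partial: the user specifies one column, the rest are filled in.
partial = fill_column_types(column_names, {"age": "categorical"}, inferred)
# Complete: the loop finds nothing to fill in.
complete = fill_column_types(column_names, dict(inferred), inferred)
# None: every column falls back to its inferred type.
none_given = fill_column_types(column_names, None, inferred)
```

When the user already supplies every column, the loop body never assigns anything, which is why a separate check on the lengths adds nothing.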

@haifeng-jin haifeng-jin merged commit 6db6146 into feature_engineering Sep 20, 2019
@haifeng-jin haifeng-jin deleted the partial_column_types branch September 20, 2019 21:58
haifeng-jin pushed a commit that referenced this pull request Sep 24, 2019
* structured data column type detection

* lgbm

* lgbm

* fix travis-ci issue

* temp commit

* weights

* modified the preprocessors for get set weights config

* typo

* lgbm

* Update preprocessor.py

* clear weights and docstrings

* d

* typo

* b

* put clear and save preprocessors in on_trial_end

* Update setup.py

* Update head.py

* b

* Update preprocessor.py

* fixing bug of not clearing weights for first build

* implement feature engineering

* doc

* Update preprocessor.py

* doc

* Update block.py

* Update preprocessor.py

* Update test_preprocessor.py

* Update hyperblock.py

* Update hyperblock.py

* block

* implement feature engineering

* regressor

* Update preprocessor.py

* Update test_preprocessor.py

* hyper

* extract the base class out.

* doc

* doc

* implement feature engineering

* suppress warning for lightgbm

* refining docstrings

* delete redundant import

* bug fix

* potential bug fix

* bug fix

* bug fix

* add new import to __init__.py

* Update auto_model.py

* bug fix

* bug fix of transform that has test data with new categories

* add structured data regressor test

* add transform new data test

* add structured data task detection to meta_model

* fix typo

* structured data blocks

* fixing bugs

* bug fix

* style change

* bug fix

* docstrings

* extract structured_data() to common.py

* extract structured_data() to common.py

* variable and function name changed

* fix typo

* fix typo

* fix typo

* bugs found and fixed by api-test (#762)

* bug tested by api

* bug

* Update io_api.py

* bug

* bug

* delete files

* bug fix

* bug fix

* integration tests

* bug

* bugs

* bugs

* bug_fix

* bug

* Create test_functional_api.py

* node bound to transform data

* bug fix

* bug fix

* bug fix

* style fix

* bug fix

* Update task_api.py

* Csv (#774)

* support structured data from csv

* Fix copypasta'd default names for AutoModels (#773)

* support data from csv files

* support structured data from csv

* Fix copypasta'd default names for AutoModels (#773)

* support data from csv files

* rebase

* bug fix

* complete StructuredDataInput.fit and reduce duplication

* delete print

* split graph hyper model and hyper built

* graph compile

* set input node for fe by compile

* separate structured blocks from heads

* addressing comments

* bug fix

* Partial column types (#776)

* support structured data from csv

* Fix copypasta'd default names for AutoModels (#773)

* support data from csv files

* support structured data from csv

* Fix copypasta'd default names for AutoModels (#773)

* support data from csv files

* rebase

* bug fix

* complete StructuredDataInput.fit and reduce duplication

* delete print

* support partial column_types

* bug fix

* addressing comments

* fix typo

* docstrings

* docstrings

* bug fix

* docstrings

* move postprocess to heads

* style changes, tests added

* docstring

* docstrings

3 participants