Generate feature column eagerly#10

Merged
simba-git merged 18 commits into develop from feature/simba/training-api on Jun 24, 2020

@simba-git

When generating a training dataset, we need all the values in the
feature column. We used to only generate the value at lookup time;
however, it's easier to eagerly generate the column to be able to use
it natively in pandas.
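
To illustrate the difference between lookup-time and eager generation, here is a minimal sketch. It is not the code from this PR: the `table`, `transform`, and `lookup_lazy` names and the `double_amount` feature are hypothetical and exist only for the example.

```python
import pandas as pd

# Hypothetical source table: one row per entity, keyed by a string index.
table = pd.DataFrame(
    {"amount": [10.0, 25.0, 40.0]},
    index=pd.Index(["a", "b", "c"], name="user"),
)


def transform(amount: float) -> float:
    """Hypothetical per-row feature transformation."""
    return amount * 2


def lookup_lazy(entity: str) -> float:
    """Lookup-time approach: compute a single value on each request."""
    return transform(table.loc[entity, "amount"])


# Eager approach: materialize the whole column once as a pandas Series,
# named after the feature, so it can be used natively in pandas.
feature_column = table["amount"].map(transform).rename("double_amount")

lazy_value = lookup_lazy("b")
eager_value = feature_column.loc["b"]
```

Both approaches return the same value for a single entity, but only the eager Series exposes every value at once, which is what building a training dataset requires.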

Simba added 18 commits June 22, 2020 11:55
When generating a training dataset, we need all the values in the
feature column. We used to only generate the value at lookup time;
however, it's easier to eagerly generate the column to be able to use
it natively in pandas.
Table indices are cast to strings. This is a reasonable default
for our current use cases, except when there is no index column. In
that case, the row number should be used as the index (and not cast
to a string).
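
The index rule described above could look roughly like the following sketch. The `set_table_index` helper and its `index_column` parameter are assumptions made for illustration, not names taken from this PR.

```python
from typing import Optional

import pandas as pd


def set_table_index(df: pd.DataFrame, index_column: Optional[str] = None) -> pd.DataFrame:
    """Hypothetical helper: string index from a column, else row numbers."""
    if index_column is None:
        # No index column: keep the positional row number as an integer index.
        return df.reset_index(drop=True)
    indexed = df.set_index(index_column)
    # Cast the index values to strings so lookups use a single key type.
    indexed.index = indexed.index.astype(str)
    return indexed


raw = pd.DataFrame({"user_id": [101, 102], "amount": [10.0, 25.0]})
by_user = set_table_index(raw, "user_id")  # index values are "101", "102"
by_row = set_table_index(raw)              # index values are 0, 1
```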
The Column should have the same name as the feature.
This avoids manual importing of every error in errors.py.
This clarifies that no column is being set to the primary key.
entity_values is an array of the value of each entity, not the value
of each feature.
Training sets can be configured by providing labels, features, and
entity mappings. The entities mapping is used to get the actual value
of each feature per label.
Adds two tests, one to check entity mapping across two CSVs and one
where the features are in the same file as the labels.
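
As a rough sketch of how labels, features, and an entity mapping might combine into a training set: the `build_training_set` function, the column names, and the data below are all hypothetical and are not the API introduced by this PR.

```python
import pandas as pd

# Hypothetical label table: each row has a label plus the entities it refers to.
labels = pd.DataFrame(
    {"user": ["a", "b", "a"], "item": ["x", "x", "y"], "clicked": [1, 0, 1]}
)

# Eagerly generated feature columns, each indexed by its entity's value.
features = {
    "user_age": pd.Series({"a": 31, "b": 45}, name="user_age"),
    "item_price": pd.Series({"x": 9.5, "y": 20.0}, name="item_price"),
}

# Entity mapping: which label column holds the entity for each feature.
entity_mapping = {"user_age": "user", "item_price": "item"}


def build_training_set(labels, features, entity_mapping, label_column):
    """Look up every feature value per label row via the entity mapping."""
    rows = pd.DataFrame(index=labels.index)
    for name, column in features.items():
        # One entity value per label row, not one value per feature.
        entity_values = labels[entity_mapping[name]]
        rows[name] = entity_values.map(column)
    rows[label_column] = labels[label_column]
    return rows


training_set = build_training_set(labels, features, entity_mapping, "clicked")
```

Here `Series.map` with another Series performs the per-row lookup, which is only possible because each feature column already exists in full.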
It's cleaner to not have any characters in the version.
This allows python3 setup.py sdist to behave properly.
The dist/ directory is generated when pushing a new version to PyPI.
It was missing a comma.
Name should only be changed via the rename method. This removes any
confusion about this.
This simplifies the API and expectations. Currently, renaming only
happens when a column is transformed.
This adds a series of simple tests for a column containing all integers
from 1 to 100 (inclusive).
@simba-git force-pushed the feature/simba/training-api branch from 8b46a59 to 582f1b2 on June 24, 2020 02:04
@simba-git merged commit 3443fcd into develop on Jun 24, 2020
@simba-git deleted the feature/simba/training-api branch on June 24, 2020 02:21
@simba-git added this to the v0.0.0a1 milestone on Jun 24, 2020