Configurable domain preference #236

Merged
merged 23 commits into from
May 25, 2022

darabos commented May 24, 2022

This comes from a more experimental repo. But even in regular LynxKite it's useful to be able to turn any domain on or off and change their order of preference when experimenting with something.

darabos added 23 commits May 24, 2022 14:08
(For benchmarking and other experimentation.)
The original code was very fancy to avoid reading the input multiple
times. This was a concern, for example with large CSVs. But now the
input is always a Table, which is backed directly by a Parquet file in
LynxKite storage. Parquet is column-oriented, so we can just read the
columns one by one.
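
A minimal sketch of the "read the columns one by one" idea, in plain Python rather than LynxKite's actual code. `ColumnarTable`, `read_column`, and `count_distinct_per_column` are hypothetical names made up for illustration; the class only stands in for a column-oriented file such as Parquet, where scanning one column does not require reading the others.

```python
from typing import Dict, Iterable, List


class ColumnarTable:
    """Hypothetical stand-in for a column-oriented table.

    Each column is stored separately, so reading one column never touches
    the data of the other columns.
    """

    def __init__(self, columns: Dict[str, List[object]]):
        self._columns = columns
        self.reads: List[str] = []  # records which columns were scanned

    def column_names(self) -> List[str]:
        return list(self._columns)

    def read_column(self, name: str) -> Iterable[object]:
        self.reads.append(name)
        return iter(self._columns[name])


def count_distinct_per_column(table: ColumnarTable) -> Dict[str, int]:
    # One simple pass per column, instead of one elaborate pass over all
    # rows that has to handle every column at the same time.
    return {
        name: len(set(table.read_column(name)))
        for name in table.column_names()
    }


table = ColumnarTable({"name": ["Adam", "Eve", "Adam"], "age": [20, 18, 20]})
result = count_distinct_per_column(table)
```

With a row-oriented input such as a CSV, each of these passes would re-read the whole file. With a column-oriented layout, each pass only costs as much as the column it scans.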
I've never even heard about it.
Use Parquet instead of Sequence files for entities
darabos merged commit 2338131 into main May 25, 2022
darabos deleted the darabos-upstreaming branch May 25, 2022 07:22
darabos added a commit that referenced this pull request May 25, 2022