[SYSTEMDS-2972] Transformencode sparse improvements #1383

Closed
ilovemesomeramen wants to merge 8 commits into apache:master from ilovemesomeramen:tweeking

@ilovemesomeramen
Contributor

This PR introduces sparse implementations for Binning, Passthrough, Recode and FeatureHashing, along with statistics and debug output.

Further, row partitioning is now fully implemented. For testing purposes, it is controlled by the variables APPLY_ROW_BLOCKS_PER_COLUMN and BUILD_ROW_BLOCKS_PER_COLUMN in the ColumnEncoder class.
Multithreading was also enabled for the transformencode instruction.
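
For readers unfamiliar with row partitioning, here is a minimal sketch of the general idea: the rows of a column are split into a fixed number of contiguous blocks so that each block can be built or applied as a separate task. This is not the code from this PR. The class `RowBlockSketch` and the method `partitionRows` are hypothetical names, and the `numBlocks` argument only stands in for a setting such as APPLY_ROW_BLOCKS_PER_COLUMN.

```java
import java.util.ArrayList;
import java.util.List;

public class RowBlockSketch {

    // Hypothetical helper (not part of SystemDS): split the row range
    // [0, numRows) into at most numBlocks contiguous [start, end) ranges.
    static List<int[]> partitionRows(int numRows, int numBlocks) {
        List<int[]> blocks = new ArrayList<>();
        if (numRows <= 0 || numBlocks <= 0)
            return blocks;
        // Ceiling division, so only the last block can be shorter.
        int blockSize = (numRows + numBlocks - 1) / numBlocks;
        for (int start = 0; start < numRows; start += blockSize)
            blocks.add(new int[] {start, Math.min(start + blockSize, numRows)});
        return blocks;
    }

    public static void main(String[] args) {
        // Each printed range could be handed to one encoder task.
        for (int[] b : partitionRows(10, 4))
            System.out.println("rows [" + b[0] + ", " + b[1] + ")");
    }
}
```

With contiguous ranges, each task touches a disjoint set of rows, which is what makes it possible to run the blocks of a column in parallel without coordinating writes.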

@phaniarnab closed this in cdff113 on Sep 8, 2021
@phaniarnab
Contributor

Thanks for the changes @ilovemesomeramen.
I added a new flag, parallel.encode, in the config file to programmatically enable/disable multithreaded transforms. Other than that, I fixed a few syntax issues before merging.
