This repository has been archived by the owner on Nov 22, 2022. It is now read-only.

fix MultipleData by making tensorizers able to initialize from multiple data sources #972

Closed
wants to merge 2 commits into from


@liaimi liaimi commented Sep 11, 2019

Summary: The newly added Data object that reads from multiple data sources has a problem when initializing tensorizers: they are initialized only from the last data source. This diff lets tensorizers with a vocab initialize from multiple data sources.

Differential Revision: D17301822

@facebook-github-bot facebook-github-bot added the CLA Signed label Sep 11, 2019
rutyrinott and others added 2 commits September 13, 2019 12:48
Summary:
Pull Request resolved: facebookresearch#953

Add data object that allows combining multiple data types during training and evaluation.
This is useful when training data is split across different files or partitions and the user wants to explore different combinations.

Reviewed By: borguz

Differential Revision: D17143117

fbshipit-source-id: 881e8288359658826600408b72cf3e6887089630
…le data sources (facebookresearch#972)

Summary:
Pull Request resolved: facebookresearch#972

The newly added Data object that reads from multiple data sources has a problem when initializing tensorizers: they are initialized only from the last data source. This diff lets tensorizers with a vocab initialize from multiple data sources by introducing a new parameter, from_scratch, in the tensorizer's initialize. When from_scratch is set to False for Data, a tensorizer accumulates vocab across the data sources. Each tensorizer is modified according to its implementation; the basic change is to not create a new vocab builder when from_scratch is False.

Reviewed By: rutyrinott

Differential Revision: D17301822

fbshipit-source-id: 041d0bc6aeff5e0ff690f6d43df416d898357f57
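
The from_scratch behavior described in the commit summary can be illustrated with a minimal, self-contained sketch. This is not the PyText implementation: `ToyVocabBuilder`, `ToyTokenTensorizer`, `initialize_from_sources`, and the row format are hypothetical names invented for illustration. Only the idea comes from the summary above: keep the existing vocab builder when from_scratch is False, so that vocab accumulates across data sources.

```python
from collections import Counter
from typing import Dict, Iterable, List, Optional

Row = Dict[str, str]


class ToyVocabBuilder:
    """Hypothetical stand-in for a vocab builder: counts tokens it has seen."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def add_all(self, tokens: Iterable[str]) -> None:
        self.counts.update(tokens)

    def make_vocab(self) -> List[str]:
        # Most frequent first; ties broken alphabetically for determinism.
        ordered = sorted(self.counts.items(), key=lambda kv: (-kv[1], kv[0]))
        return [token for token, _ in ordered]


class ToyTokenTensorizer:
    """Hypothetical tensorizer that builds a vocab from a text column."""

    def __init__(self, column: str = "text") -> None:
        self.column = column
        self.vocab_builder: Optional[ToyVocabBuilder] = None
        self.vocab: List[str] = []

    def initialize(self, rows: Iterable[Row], from_scratch: bool = True) -> None:
        # Assumed mechanism: only create a new builder when starting from
        # scratch, so earlier sources are not forgotten otherwise.
        if from_scratch or self.vocab_builder is None:
            self.vocab_builder = ToyVocabBuilder()
        for row in rows:
            self.vocab_builder.add_all(row[self.column].split())
        self.vocab = self.vocab_builder.make_vocab()


def initialize_from_sources(
    tensorizer: ToyTokenTensorizer, sources: Iterable[Iterable[Row]]
) -> None:
    """Start from scratch on the first source, then accumulate on the rest."""
    for index, rows in enumerate(sources):
        tensorizer.initialize(rows, from_scratch=(index == 0))


source_a = [{"text": "hello world"}]
source_b = [{"text": "world again"}]

# Problem case: every source starts from scratch, so only the last one counts.
last_only = ToyTokenTensorizer()
for rows in (source_a, source_b):
    last_only.initialize(rows, from_scratch=True)

# Accumulating case: the builder is reused after the first source.
accumulated = ToyTokenTensorizer()
initialize_from_sources(accumulated, [source_a, source_b])

print(last_only.vocab)
print(accumulated.vocab)
```

In this sketch the first tensorizer ends up with tokens from `source_b` only, while the second one sees tokens from both sources.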
@facebook-github-bot
Contributor

This pull request has been merged in 1e4dd71.

Labels
CLA Signed, Merged