Dynamic dataset #364

Merged
jsenellart merged 44 commits into OpenNMT:master from jsenellart:GlobalDataset on Sep 14, 2017


jsenellart commented Aug 30, 2017

Create a dynamic dataset structure that performs preprocessing and weighted sampling on the fly over a directory of training corpora.
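To make the idea of weighted sampling over several corpora concrete, here is a minimal Python sketch. It is not the code from this pull request (which is written in Lua) and does not claim to mirror its sampling rules. The function names `allocate_samples` and `sample_epoch`, the in-memory `corpora` dictionary standing in for files in a directory, and the choice to sample with replacement are all assumptions made for illustration.

```python
import random
from typing import Dict, List, Tuple


def allocate_samples(weights: Dict[str, float], sample_size: int) -> Dict[str, int]:
    """Split sample_size across corpora in proportion to their weights (hypothetical helper)."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Floor each proportional share, then give the leftover slots to the
    # largest fractional parts so the counts add up to sample_size exactly.
    exact = {name: sample_size * w / total for name, w in weights.items()}
    counts = {name: int(share) for name, share in exact.items()}
    remainder = sample_size - sum(counts.values())
    by_fraction = sorted(exact, key=lambda n: exact[n] - counts[n], reverse=True)
    for name in by_fraction[:remainder]:
        counts[name] += 1
    return counts


def sample_epoch(
    corpora: Dict[str, List[str]],
    weights: Dict[str, float],
    sample_size: int,
    seed: int = 0,
) -> List[Tuple[str, str]]:
    """Draw one epoch of (corpus name, sentence) pairs (hypothetical helper)."""
    rng = random.Random(seed)
    counts = allocate_samples(weights, sample_size)
    batch: List[Tuple[str, str]] = []
    for name, n in counts.items():
        sentences = corpora[name]
        # Assumption: sample with replacement, so a small corpus with a
        # large weight can be over-sampled.
        batch.extend((name, rng.choice(sentences)) for _ in range(n))
    rng.shuffle(batch)
    return batch


# Stand-in for a directory of training files: file name -> sentences.
corpora = {
    "news.txt": ["n1", "n2", "n3", "n4"],
    "subtitles.txt": ["s1", "s2"],
}
weights = {"news.txt": 3.0, "subtitles.txt": 1.0}
epoch = sample_epoch(corpora, weights, sample_size=8, seed=42)
```

In a real setup the sentences would be read and preprocessed from the files at sampling time instead of being held in a dictionary; the sketch only shows how per-corpus weights can translate into per-epoch draw counts.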

codecov-io commented Aug 30, 2017

Codecov Report

Merging #364 into master will increase coverage by 0.55%.
The diff coverage is 83.4%.

@@            Coverage Diff             @@
##           master     #364      +/-   ##
==========================================
+ Coverage   70.53%   71.09%   +0.55%     
==========================================
  Files          71       74       +3     
  Lines        5980     6459     +479     
==========================================
+ Hits         4218     4592     +374     
- Misses       1762     1867     +105

Impacted Files                         Coverage Δ
onmt/data/SampledDataset.lua           70.04% <100%> (-2.29%) ⬇️
onmt/data/init.lua                     100% <100%> (ø) ⬆️
onmt/data/DynamicDataset.lua           16.66% <16.66%> (ø)
onmt/data/DynamicDataRepository.lua    37.5% <37.5%> (ø)
onmt/utils/Error.lua                   88.23% <66.66%> (ø) ⬆️
onmt/utils/FileReader.lua              89.33% <84.21%> (-6.02%) ⬇️
onmt/data/SampledVocabDataset.lua      85.18% <85.18%> (ø)
onmt/data/Preprocessor.lua             93.21% <91.56%> (-3.91%) ⬇️
onmt/utils/ExtendedCmdLine.lua         64.54% <0%> (-1.67%) ⬇️
onmt/utils/Features.lua                96.77% <0%> (-1.62%) ⬇️
... and 4 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9267186...4d2b2c6.

jsenellart changed the title from "[WIP] Global dataset" to "[WIP] Dynamic dataset" on Aug 31, 2017

jsenellart changed the title from "[WIP] Dynamic dataset" to "Dynamic dataset" on Sep 14, 2017

jsenellart merged commit 1a081bc into OpenNMT:master on Sep 14, 2017

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed

jsenellart deleted the jsenellart:GlobalDataset branch on Sep 14, 2017

guillaumekln added a commit that referenced this pull request Oct 5, 2017
