
issue about data preprocess #14

Closed
shangqing-liu opened this issue Jul 17, 2019 · 2 comments
@shangqing-liu

shangqing-liu commented Jul 17, 2019

Hi,
While reading reader.py, I have a question about the function process_dataset, specifically the part that samples up to the maximum number of contexts:

```python
safe_limit = tf.cast(tf.maximum(num_contexts_per_example, self.config.MAX_CONTEXTS), tf.int32)
rand_indices = tf.random_shuffle(tf.range(safe_limit))[:self.config.MAX_CONTEXTS]
contexts = tf.gather(all_contexts, rand_indices)  # (max_contexts,)
```

It seems this could index out of bounds when an example has fewer than MAX_CONTEXTS contexts.

@urialon
Contributor

urialon commented Jul 17, 2019

Hi,
There is no out-of-bounds access, because the contexts are padded: even when an example has fewer than max_contexts contexts, it is padded up to max_contexts, so every index produced by the shuffle is valid.

Best,
Uri
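To illustrate the argument, here is a minimal NumPy sketch of the sampling step (names and the pad value are hypothetical; the real reader uses TensorFlow ops and its own padding scheme). Because `all_contexts` is padded up to `MAX_CONTEXTS` before sampling, `safe_limit = max(num_contexts, MAX_CONTEXTS)` never exceeds the length of the array being gathered from:

```python
import numpy as np

MAX_CONTEXTS = 5  # hypothetical value; config.MAX_CONTEXTS in the real reader


def sample_contexts(all_contexts, num_real_contexts, pad_value=0):
    # Pad up to MAX_CONTEXTS, mirroring what the dataset pipeline does
    # before the sampling step.
    if len(all_contexts) < MAX_CONTEXTS:
        all_contexts = all_contexts + [pad_value] * (MAX_CONTEXTS - len(all_contexts))
    # safe_limit = max(num_real_contexts, MAX_CONTEXTS) is never larger
    # than the padded length, so every sampled index is in bounds.
    safe_limit = max(num_real_contexts, MAX_CONTEXTS)
    rand_indices = np.random.permutation(safe_limit)[:MAX_CONTEXTS]
    return [all_contexts[i] for i in rand_indices]
```

In both cases the gather is safe: with fewer than MAX_CONTEXTS real contexts, indices stay below MAX_CONTEXTS and hit either real or padded entries; with more, `safe_limit` equals the (unpadded) number of contexts and the slice keeps only MAX_CONTEXTS of them.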

@shangqing-liu
Author

Got it, thanks very much @urialon

@urialon urialon closed this as completed Jul 30, 2019