Introduce new parameter to control number of batches in negative edge sampling #25

Closed
einbandi opened this issue Jul 21, 2022 · 2 comments

Comments

@einbandi
Collaborator

The parameters n_epochs and batch_size do not fully determine the number of batches in negative edge sampling. Because negative edge sampling uses a weighted random sampler without replacement, there is no guarantee that each item is seen exactly once. Instead, the user would need to define something like a batches_per_epoch parameter (the name needs rethinking; maybe check what it is called in UMAP).

@einbandi
Collaborator Author

einbandi commented Aug 1, 2022

This is actually not necessary; it is enough to set a higher number of epochs. The docstring of the TrainingPhase class now explains how the batch size relates to the number of edges and items in a batch. It also explains how the meaning of an epoch differs between the two sampling variants.

@einbandi
Collaborator Author

einbandi commented Aug 1, 2022

On second thought, it might be better to have a batches_per_epoch parameter whose default is chosen so that the number of items/edges sampled per epoch is roughly equal to the size of the full dataset (but which can be overridden).
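
A minimal sketch of how such a default could be resolved, assuming the default is simply the dataset size divided by the batch size, rounded up. The function name `resolve_batches_per_epoch` and its arguments are hypothetical, chosen for illustration only; this is not the project's actual implementation.

```python
import math
from typing import Optional


def resolve_batches_per_epoch(
    n_samples: int,
    batch_size: int,
    batches_per_epoch: Optional[int] = None,
) -> int:
    """Return the number of batches to draw in one epoch.

    Hypothetical helper: n_samples stands for the number of items
    (or edges) in the full dataset.
    """
    if n_samples <= 0 or batch_size <= 0:
        raise ValueError("n_samples and batch_size must be positive")

    # An explicit user setting overrides the default.
    if batches_per_epoch is not None:
        if batches_per_epoch <= 0:
            raise ValueError("batches_per_epoch must be positive")
        return batches_per_epoch

    # Default: draw enough batches that roughly n_samples items
    # are sampled per epoch.
    return math.ceil(n_samples / batch_size)
```

With 1000 items and a batch size of 64, for example, this default gives 16 batches, so 1024 items are drawn per epoch, slightly more than the dataset size. Rounding up is one possible choice here; rounding down would sample slightly fewer items instead.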
