fix CharDataset::__len__ off by one error #31

Merged: 1 commit, Aug 26, 2020


fpgaminer
Contributor

I made an off-by-one mistake in my comments on issue #22, which unfortunately got rolled into 339f4e7. Sorry about that!

Verified by example:

>>> data = list(range(8))
>>> len(data)
8
>>> block_size = 4
>>> 
>>> 
>>> for i in range(len(data) - block_size):
...     print(i)
...     print(data[i:i+block_size+1])
... 
0
[0, 1, 2, 3, 4]
1
[1, 2, 3, 4, 5]
2
[2, 3, 4, 5, 6]
3
[3, 4, 5, 6, 7]
>>> for i in range(len(data) - (block_size+1)):
...     print(i)
...     print(data[i:i+block_size+1])
... 
0
[0, 1, 2, 3, 4]
1
[1, 2, 3, 4, 5]
2
[2, 3, 4, 5, 6]
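
The same check can be done by brute force rather than by eye. This is only a sketch: `count_full_windows` is a hypothetical helper, not something in the repo. It counts every start index whose slice has the full `block_size + 1` elements and compares that count against the two candidate formulas.

```python
def count_full_windows(data, block_size):
    """Count start indices i where data[i:i+block_size+1] is a full-length chunk.

    Hypothetical helper for illustration only; not part of the repository.
    """
    window = block_size + 1
    return sum(1 for i in range(len(data)) if len(data[i:i + window]) == window)


data = list(range(8))
block_size = 4

n = count_full_windows(data, block_size)
print(n)                                   # 4
print(n == len(data) - block_size)         # True
print(n == len(data) - (block_size + 1))   # False: drops the last valid chunk
```

With 8 items and `block_size = 4` there are 4 full chunks, so `len(data) - block_size` matches and `len(data) - (block_size + 1)` is one short.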

@karpathy
Owner

An easy way to see this is that when block_size + 1 == len(data), the call to len should return 1. I actually spotted the issue in your PR, then copied it into the code anyway and forgot about it. I think it was just too late.
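
That boundary case can be written down as a small check. The class below is a hypothetical, torch-free stand-in (`WindowDataset` is an invented name, and returning the chunk split into input and shifted target is an assumption about how the chunks are used), so treat it as a sketch of the length logic rather than the repo's actual `CharDataset`.

```python
class WindowDataset:
    """Hypothetical stand-in for the length logic under discussion (no torch)."""

    def __init__(self, data, block_size):
        self.data = data
        self.block_size = block_size

    def __len__(self):
        # One item per start index that still leaves block_size + 1 elements.
        return len(self.data) - self.block_size

    def __getitem__(self, idx):
        # Grab block_size + 1 elements, then split into input and shifted target.
        chunk = self.data[idx:idx + self.block_size + 1]
        return chunk[:-1], chunk[1:]


# Boundary case: the data holds exactly one chunk of block_size + 1 elements.
ds = WindowDataset(list("hello"), block_size=4)
assert len(ds) == 1

x, y = ds[len(ds) - 1]  # the last valid index must yield a full chunk
assert x == ["h", "e", "l", "l"]
assert y == ["e", "l", "l", "o"]
```

With the `len(data) - (block_size + 1)` formula, the same dataset would report a length of 0 and its single valid chunk would never be sampled.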

@karpathy karpathy merged commit c436005 into karpathy:master Aug 26, 2020
alpercanberk pushed a commit to alpercanberk/moe-transformer that referenced this pull request Aug 27, 2023
fix CharDataset::__len__ off by one error

2 participants