ValueError: empty range for randrange() (0,0, 0) #34

Open
genbei opened this issue Jun 3, 2021 · 4 comments

Comments

@genbei

genbei commented Jun 3, 2021

I processed the data into the format you described. Here are the command I ran and the resulting error:

python code/augment.py --input=train_50w.en --output=train_50w._augmented.txt --num_aug=1 --alpha_sr=0.05 --alpha_rd=0.05 --alpha_ri=0 --alpha_rs=0.05

Traceback (most recent call last):
  File "code/augment.py", line 75, in <module>
    gen_eda(args.input, output, alpha_sr=alpha_sr, alpha_ri=alpha_ri, alpha_rs=alpha_rs, alpha_rd=alpha_rd, num_aug=num_aug)
  File "code/augment.py", line 64, in gen_eda
    aug_sentences = eda(sentence, alpha_sr=alpha_sr, alpha_ri=alpha_ri, alpha_rs=alpha_rs, p_rd=alpha_rd, num_aug=num_aug)
  File "/home/tool/eda_nlp-master/code/eda.py", line 201, in eda
    a_words = random_swap(words, n_rs)
  File "/home/tool/eda_nlp-master/code/eda.py", line 130, in random_swap
    new_words = swap_word(new_words)
  File "/home/tool/eda_nlp-master/code/eda.py", line 134, in swap_word
    random_idx_1 = random.randint(0, len(new_words)-1)
  File "/home/miniconda3/envs/eda/lib/python3.6/random.py", line 221, in randint
    return self.randrange(a, b+1)
  File "/home/miniconda3/envs/eda/lib/python3.6/random.py", line 199, in randrange
    raise ValueError("empty range for randrange() (%d,%d, %d)" % (istart, istop, width))
ValueError: empty range for randrange() (0,0, 0)

@bergr7

bergr7 commented Jun 8, 2021

Same issue here:

python code/augment.py --input=data/train_original.txt --num_aug=15 --alpha_sr=0.1

Traceback (most recent call last):
  File "code/augment.py", line 75, in <module>
    gen_eda(args.input, output, alpha_sr=alpha_sr, alpha_ri=alpha_ri, alpha_rs=alpha_rs, alpha_rd=alpha_rd, num_aug=num_aug)
  File "code/augment.py", line 64, in gen_eda
    aug_sentences = eda(sentence, alpha_sr=alpha_sr, alpha_ri=alpha_ri, alpha_rs=alpha_rs, p_rd=alpha_rd, num_aug=num_aug)
  File "/Users/bernardogarcia/GitHub/eda_nlp/code/eda.py", line 194, in eda
    a_words = random_insertion(words, n_ri)
  File "/Users/bernardogarcia/GitHub/eda_nlp/code/eda.py", line 153, in random_insertion
    add_word(new_words)
  File "/Users/bernardogarcia/GitHub/eda_nlp/code/eda.py", line 160, in add_word
    random_word = new_words[random.randint(0, len(new_words)-1)]
  File "/Users/bernardogarcia/opt/anaconda3/envs/nlp-news_filter/lib/python3.7/random.py", line 222, in randint
    return self.randrange(a, b+1)
  File "/Users/bernardogarcia/opt/anaconda3/envs/nlp-news_filter/lib/python3.7/random.py", line 200, in randrange
    raise ValueError("empty range for randrange() (%d,%d, %d)" % (istart, istop, width))
ValueError: empty range for randrange() (0,0, 0)

@msub0310

msub0310 commented Dec 4, 2021

In code/eda.py, the main function eda begins with the following call at line 175:

sentence = get_only_chars(sentence)

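The get_only_chars function is a preprocessing step that removes non-alphabetic characters from the text. If the input consists only of non-alphabetic characters, the word list ends up empty, so len(words)-1 is -1 in calls such as the one below (line 117, and similar lines elsewhere). random.randint(0, -1) then calls randrange(0, 0), which is an empty range, and that is the ValueError shown in the tracebacks above.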

rand_int = random.randint(0, len(words)-1)

If your data contains text that consists only of non-alphabetic characters, you can avoid this error by modifying the condition in get_only_chars at line 45 so that digits and some punctuation are no longer stripped, as follows:

if char in 'qwertyuiopasdfghjklzxcvbnm123456789(). ':
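
A minimal sketch of the failure described above and of one way to guard against it. The clean and to_words functions are simplified stand-ins for the repository's preprocessing, not its actual code, and safe_random_index is a hypothetical helper introduced only for illustration.

```python
import random

# Assumption: only lowercase letters and spaces survive cleaning.
ALLOWED = set("qwertyuiopasdfghjklzxcvbnm ")


def clean(sentence):
    """Simplified stand-in for get_only_chars: keep letters and spaces only."""
    lowered = sentence.lower()
    kept = "".join(ch if ch in ALLOWED else " " for ch in lowered)
    return " ".join(kept.split())  # collapse runs of spaces


def to_words(sentence):
    """Split the cleaned sentence into words; symbol-only input gives []."""
    return [w for w in clean(sentence).split(" ") if w != ""]


def safe_random_index(words):
    """Hypothetical guard: return None instead of calling randint on an empty list."""
    if not words:
        return None
    return random.randint(0, len(words) - 1)


words = to_words("------ 2021 !!!")
print(words)  # the word list is empty

try:
    # len(words) - 1 is -1, so this is randint(0, -1) -> randrange(0, 0)
    random.randint(0, len(words) - 1)
except ValueError as exc:
    print("ValueError:", exc)

print(safe_random_index(words))  # the guard returns None instead of raising
```

Skipping a sentence when the word list is empty leaves the cleaning rules unchanged, whereas widening the allowed character set changes what the augmented text can contain. Which is preferable depends on the data.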

@zhoujiangfeng

I had the same problem. It is most likely a problem with your data, for example a sentence that is empty or that consists only of special tokens such as "------".
Check your data.
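
One way to check the data is to scan for lines whose text has no letters at all. This is a rough sketch: find_unusable_lines is a hypothetical helper, and the tab separator and column position are assumptions about the input file, not something stated in this thread.

```python
import string

LETTERS = set(string.ascii_letters)


def find_unusable_lines(lines, text_column=1, sep="\t"):
    """Return (line_number, raw_line) pairs whose text column has no ASCII letters."""
    bad = []
    for number, raw in enumerate(lines, start=1):
        parts = raw.rstrip("\n").split(sep)
        # A missing column is treated the same as empty text.
        text = parts[text_column] if len(parts) > text_column else ""
        if not any(ch in LETTERS for ch in text):
            bad.append((number, raw.rstrip("\n")))
    return bad


# Made-up sample lines for illustration only.
sample = [
    "1\tthis movie was great\n",
    "0\t------\n",
    "1\t\n",
    "0\t12345 !!!\n",
]
for number, raw in find_unusable_lines(sample):
    print(number, repr(raw))
```

For a real file, the same function can be called on the result of reading the file's lines, and the reported lines can then be removed or fixed before running the augmentation.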

@shakiba-bakhtiari

I also have this problem. Could it be related to the dataset?
My dataset has three columns, but the dataset in this repository has two.
Could you please help?
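
If the script reads the sentence from the second tab-separated field (an assumption; this thread does not show the expected input format), a three-column file could place a non-text field such as an ID there, which would be cleaned down to an empty word list and trigger the same error. A hypothetical sketch for reducing such a file to two columns; the function name and the column positions are placeholders to adjust for your data.

```python
def to_label_and_text(line, label_column=0, text_column=2, sep="\t"):
    """Keep only the label and text fields of one delimited line.

    Column positions are assumptions; raises ValueError if the line is too short.
    """
    parts = line.rstrip("\n").split(sep)
    needed = max(label_column, text_column) + 1
    if len(parts) < needed:
        raise ValueError("expected at least %d columns, got %d" % (needed, len(parts)))
    return parts[label_column] + sep + parts[text_column]


# Made-up three-column lines: label, id, text.
rows = [
    "1\t0001\tthis movie was great\n",
    "0\t0002\tnot worth the time\n",
]
for row in rows:
    print(to_label_and_text(row))
```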
