Some files not found when executing auto_phrase.sh #2
Comments
May I ask what language your data set is in? Or can you provide a sample set so that we can reproduce the results?
I just used the Default Run as described in the README, which downloads DBLP.txt.gz from "http://dmserv2.cs.illinois.edu/data/DBLP.txt.gz". Maybe something went wrong with the downloaded file. I will check it first.
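One way to check the downloaded file, as a minimal sketch: read the archive through to the end so that a truncated or corrupted download surfaces as an exception. The function name `gzip_is_intact` and the path `data/DBLP.txt.gz` are assumptions for illustration; neither comes from the repository.

```python
import gzip
import zlib


def gzip_is_intact(path, chunk_size=1 << 20):
    """Return True if the whole gzip file at `path` decompresses cleanly."""
    try:
        with gzip.open(path, "rb") as fh:
            # Read to the end; the CRC and length are verified at end of stream.
            while fh.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        # OSError: bad header/CRC or missing file; EOFError: truncated stream;
        # zlib.error: corrupt compressed data.
        return False
    return True


if __name__ == "__main__":
    # Hypothetical location; adjust to wherever the file was saved.
    print(gzip_is_intact("data/DBLP.txt.gz"))
```

A `False` result would suggest downloading the file again before rerunning the script.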
That's quite strange. According to the exception you reported, the bash script got stuck at the very end of the job (the Generating Output stage, line 108). I have rerun the current repository on our Linux machine but couldn't reproduce the problem. Could you paste the complete log?
Here is the output:
real 2m41.072s
-ne Current step: Tokenizing stopword file...
Picked up JAVA_TOOL_OPTIONS: -Duser.language=en
ERROR: while reading string from binary file
ERROR: while reading string from binary file
ERROR: while reading string from binary file
ERROR: while reading string from binary file
ERROR: while reading string from binary file
ERROR: while reading string from binary file
ERROR: while reading string from binary file
ERROR: while reading string from binary file
ERROR: while reading string from binary file
real 0m10.902s
I downloaded DBLP.txt.gz by hand and put it in the data folder. The first several lines are:
The dataset should be correct.
If it is Linux, I need your kernel output by running
I am using macOS Sierra.
FYI, I have tested on a notebook with Sierra (16.3.0) installed. After installing gcc6 and java8, the script runs correctly. Please consider reinstalling gcc and Java following the updated instructions (g++ 6). After installation, you should have the following versions for gcc and Java respectively.
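To see which compiler and Java runtime are actually picked up from PATH, something like the following sketch could help. The helper `version_output` is hypothetical and not part of the repository; it only runs the standard `g++ --version` and `java -version` commands and returns whatever they print.

```python
import shutil
import subprocess


def version_output(command):
    """Run a version command and return its text output, or None if the tool is not on PATH."""
    if shutil.which(command[0]) is None:
        return None
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # `java -version` writes to stderr
        text=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    for cmd in (["g++", "--version"], ["java", "-version"]):
        print(" ".join(cmd), "->", version_output(cmd))
```

On macOS the default `g++` is usually a wrapper around Apple's clang rather than GCC, so it is worth reading the output instead of assuming that the newly installed gcc is the one being used.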
When I execute auto_phrase.sh, some exceptions are thrown:
java.io.FileNotFoundException: tmp/final_quality_multi-words.txt
java.io.FileNotFoundException: tmp/final_quality_unigrams.txt
java.io.FileNotFoundException: tmp/final_quality_salient.txt
According to the stack trace, the problem is located
at Tokenizer.tokenizeText(Tokenizer.java:618)
at Tokenizer.main(Tokenizer.java:766)
So, how can I resolve this?
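To narrow down which stage failed, it may help to check which of the files named in the exceptions above are missing or empty before the final stage reads them. The following is only a sketch: the helper `missing_or_empty` is hypothetical, the file list is taken solely from the exception messages, and it assumes the check is run from the directory that contains `tmp/`.

```python
import os

# File names taken from the FileNotFoundException messages above.
EXPECTED_OUTPUTS = [
    "tmp/final_quality_multi-words.txt",
    "tmp/final_quality_unigrams.txt",
    "tmp/final_quality_salient.txt",
]


def missing_or_empty(paths, base_dir="."):
    """Return the paths that do not exist, or exist but have zero size."""
    problems = []
    for rel in paths:
        full = os.path.join(base_dir, rel)
        if not os.path.isfile(full) or os.path.getsize(full) == 0:
            problems.append(rel)
    return problems


if __name__ == "__main__":
    for path in missing_or_empty(EXPECTED_OUTPUTS):
        print("missing or empty:", path)
```

If all three files are reported, the failure most likely happened in an earlier stage, and the FileNotFoundException is only a downstream symptom.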