Error: literal carriage return found in data #523
Comments
The intended way to run the whole corpus of signalmedia-1m, is to put the directory under |
@netj |
@netj Ps: thanks to your feedback, I understand that |
Yes, I confirmed there seem to be issues with the existing cat "$corpus" |
#grep -E 'wife|husband|married' |
#head -100 |
jq -r '[.id, .content] | @tsv' |
# take care of carriage returns
sed 's/\\r//g' |
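The carriage-return step in the pipeline above can be isolated as a small filter. This is only a sketch, assuming the goal is to drop both raw CR bytes and the two-character `\r` escape that `jq`'s `@tsv` can emit. The function name `strip_carriage_returns` is hypothetical and not part of DeepDive.

```shell
# Hypothetical helper, not part of DeepDive: reads TSV on stdin and
# writes it to stdout with carriage returns removed.
strip_carriage_returns() {
    # tr deletes raw CR bytes (0x0D); sed then deletes the two-character
    # sequence backslash + r, the same substitution used in the pipeline.
    tr -d '\r' | sed 's/\\r//g'
}
```

It could be dropped into the pipeline in place of the bare `sed` stage, for example `jq -r '[.id, .content] | @tsv' | strip_carriage_returns`. Note that the `sed` expression also rewrites an escaped backslash followed by `r` (for instance `C:\\root` in the TSV), so it is only safe when such text is not expected in the corpus.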
Hi all,
I got the error below when running "deepdive do sentences" in the quickstart example (the "has spouse" example) with the full dataset from signalmedia (1 million records):
The content of document 1060ad64-521f-46c7-a804-4181d97f9bf0 is:
After Googling around, I see that this error comes from the COPY command of PostgreSQL. I have some questions:
1/ Is there any way to pre-process the data to prevent this error from happening again?
2/ Are there other known bugs like this, so that we can collect them and create a single pre-processing step that prevents all of them at once?
3/ Does DeepDive have any mechanism to log these errors and skip the affected records so that the run can continue?
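For questions 1 and 3, one possible approach outside DeepDive itself is to split the TSV before loading, so that offending lines are logged to a separate file instead of aborting the load. The sketch below assumes one record per line and only detects raw carriage-return bytes. The function name `split_on_carriage_return` and the file names are hypothetical.

```shell
# Hypothetical helper, not part of DeepDive: reads TSV lines on stdin,
# writes lines without a raw carriage return to the file named by $1
# and lines containing one to the file named by $2.
split_on_carriage_return() {
    # Create both output files up front so each exists even if empty.
    : > "$1"
    : > "$2"
    awk -v clean="$1" -v bad="$2" '
        /\r/ { print > bad; next }   # record contains a raw CR: set it aside
        { print > clean }            # everything else passes through
    '
}
```

A call such as `split_on_carriage_return clean.tsv rejected.tsv < articles.tsv` would leave the skipped records in `rejected.tsv` for later inspection.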