Url-Categorization #5
Comments
Hey, thank you for your message. I think the problem is that the URL categorization dataset file is corrupted due to GitHub LFS restrictions. I would suggest downloading the original dataset from https://data.world/crowdflower/url-categorization. Let me know if the issue remains.
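One way a file stored with Git LFS can end up unusable is that the checkout holds only a small text pointer instead of the real data. Whether that is what happened here is an assumption, not something confirmed in this thread. A minimal sketch of a check, where the function name and the sample file names are hypothetical:

```python
import os
import tempfile

# Git LFS pointer files start with this spec line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def looks_like_lfs_pointer(path):
    """Return True if the file at `path` begins like a Git LFS pointer."""
    with open(path, "rb") as handle:
        head = handle.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


# Demo on two throwaway files (hypothetical contents, not the real dataset).
with tempfile.TemporaryDirectory() as tmp:
    pointer_path = os.path.join(tmp, "pointer.csv")
    csv_path = os.path.join(tmp, "real.csv")
    with open(pointer_path, "wb") as f:
        f.write(LFS_POINTER_PREFIX + b"\noid sha256:0000\nsize 123\n")
    with open(csv_path, "wb") as f:
        f.write(b"url,main_category\nhttp://example.com,News\n")
    pointer_result = looks_like_lfs_pointer(pointer_path)
    csv_result = looks_like_lfs_pointer(csv_path)
```

If the check returns True for the dataset path set in config.py, the file is a pointer and the dataset needs to be downloaded separately.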
I downloaded the dataset from the link you sent and also
set the dataset path in the config.py file, but it still shows the error
main_category_index not found.
…On Tue, 26 Jan 2021, 01:31 Domantas Meidus, ***@***.***> wrote:
Hey, thank you for your message.
I think the problem is that the URL categorization dataset file is corrupted
due to GitHub LFS restrictions. I would suggest downloading the original
dataset from https://data.world/crowdflower/url-categorization.
Let me know if the issue remains.
I updated the code with a fix. The problem was that data.world provides two datasets with the same data but different column names:
1. original/URL-categorization-DFE.csv (column: 'main_category:confidence')
2. data/url_categorization_dfe.csv (column: 'main_category_confidence')
My previous code used the second dataset, so if you used the first one, the error was likely caused by the different column naming. Thank you for reporting the problem, and let me know if the issue remains unsolved or if you have any other problems running this code! Cheers!
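For readers who hit the same mismatch, one possible workaround is to normalize the column names on load so that either file works. This is a minimal sketch, not the fix that was committed to the repository. The helper names and the one-row sample are hypothetical, and it uses the standard library `csv` module to stay self-contained:

```python
import csv
import io


def normalize_column(name):
    """Turn 'main_category:confidence' into 'main_category_confidence'."""
    return name.replace(":", "_")


def read_rows(handle):
    """Yield CSV rows as dicts whose keys use the underscore naming."""
    reader = csv.DictReader(handle)
    for row in reader:
        yield {normalize_column(key): value for key, value in row.items()}


# Hypothetical one-row sample using the colon-style header.
sample = io.StringIO(
    "url,main_category,main_category:confidence\n"
    "http://example.com,News,0.9\n"
)
rows = list(read_rows(sample))
```

With pandas, the same idea can be applied by passing the function to `DataFrame.rename`, as in `df.rename(columns=normalize_column)`.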
It worked.
…On Thu, 28 Jan 2021 at 9:19 PM, Domantas Meidus ***@***.***> wrote:
I updated the code with a fix. The problem was that data.world
provides two datasets with the same data but different column
names:
1. original/URL-categorization-DFE.csv (column:
'main_category:confidence')
2. data/url_categorization_dfe.csv (column: 'main_category_confidence')
My previous code used the second dataset, so if you used the first one,
the error was likely caused by the different column naming.
Thank you for reporting the problem, and let me know if the issue
remains unsolved or if you have any other problems running this
code!
Cheers!
After running 01_construct_features.py, I get the error "['main_category_confidence'] not in index".