nlp subject Exercise 7 audit results fail to follow task instructions #2349

Closed
jarmo-seljamaa opened this issue Dec 10, 2023 · 1 comment · Fixed by #2360

@jarmo-seljamaa

nlp

We struggled to reproduce the results described in the audit of Exercise 7. After a lot of trial and error, we discovered that we get the exact audit results ONLY when we ignore the task instructions.

The instructions say:

Steps:

  1. Preprocess the data using the function implemented earlier. Then, use CountVectorizer from scikit-learn with max_features=500 to compute the word count of the tweets. The output is a sparse matrix.

We got the expected audit results for Questions 1, 2 and 3 only when we skipped preprocessing completely and fed the raw data from the given file straight into the CountVectorizer. This is most probably not the intended solution. Please re-check the audit.
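To illustrate why the two approaches cannot produce the same audit numbers, here is a minimal sketch. The tweets and the `preprocess` function are hypothetical stand-ins (the real data file and the preprocessing function from the earlier exercise are not reproduced here); only `CountVectorizer(max_features=500)` comes from the task instructions.

```python
import re

from sklearn.feature_extraction.text import CountVectorizer

# Toy tweets, NOT the exercise's data file.
tweets = [
    "Loving the new update!!! #happy",
    "I loved the updates, so happy :)",
    "Updates are not loading... unhappy",
]

# Assumed stop-word list, for illustration only.
STOP_WORDS = {"i", "the", "so", "are"}


def crude_stem(token):
    """Strip one common suffix; a rough placeholder for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Hypothetical stand-in for the exercise's preprocessing function."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return " ".join(crude_stem(t) for t in tokens if t not in STOP_WORDS)


# Variant A: raw text straight into the vectorizer (what matched the audit).
raw_vec = CountVectorizer(max_features=500)
raw_counts = raw_vec.fit_transform(tweets)

# Variant B: preprocess first, as the task instructions require.
clean_vec = CountVectorizer(max_features=500)
clean_counts = clean_vec.fit_transform([preprocess(t) for t in tweets])

print("raw shape:  ", raw_counts.shape)
print("clean shape:", clean_counts.shape)
print("only in raw vocabulary:", sorted(set(raw_vec.vocabulary_) - set(clean_vec.vocabulary_)))
```

Because preprocessing merges or drops tokens, the vocabulary and therefore the columns of the sparse matrix differ between the two variants, so any audit values derived from the matrix will differ as well.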

nprimo self-assigned this on Dec 11, 2023
nprimo linked a pull request on Dec 18, 2023 that will close this issue
@nprimo
Contributor

nprimo commented Dec 18, 2023

Hi @jarmo-seljamaa, thank you for the feedback! The subject and audit will be updated in the coming days to resolve this issue.
