nlp subject Exercise 7 audit results fail to follow task instructions #2349

jarmo-seljamaa · 2023-12-10T08:17:38Z

nlp

We struggled to reproduce the results described in the audit for Exercise 7. After a lot of trial and error, we found that we get the exact audit results ONLY when we ignore the task instructions.

The instructions say:

Steps:

Preprocess the data using the function implemented earlier. Then, use CountVectorizer from scikit-learn with max_features=500 to compute the word count of the tweets. The output is a sparse matrix.

We could get the expected audit results for Questions 1, 2, and 3 only when we skipped preprocessing completely and fed the raw data from the given file into CountVectorizer. This is most probably not the intended solution. Please re-check the audit.
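To illustrate why the two approaches give different results, here is a minimal sketch. The `preprocess` function and the sample tweets are hypothetical stand-ins (the exercise's actual preprocessing function and data file are not reproduced here); only `CountVectorizer(max_features=500)` comes from the instructions.

```python
import re
from sklearn.feature_extraction.text import CountVectorizer


def preprocess(text):
    """Hypothetical stand-in for the exercise's preprocessing function."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    text = re.sub(r"@\w+", " ", text)          # drop @mentions
    text = re.sub(r"[^a-z\s]", " ", text)      # keep letters and whitespace only
    return " ".join(text.split())              # collapse repeated whitespace


# Made-up sample data, not taken from the exercise file
tweets = [
    "Check https://t.co/abc123 now! @user 2023 #NLP",
    "NLP is fun, isn't it? @friend",
]


def word_counts(docs):
    """Fit a CountVectorizer as described in the instructions."""
    vectorizer = CountVectorizer(max_features=500)
    matrix = vectorizer.fit_transform(docs)  # sparse document-term matrix
    return matrix, set(vectorizer.vocabulary_)


# Variant A: raw tweets fed straight into CountVectorizer (matches the audit)
X_raw, vocab_raw = word_counts(tweets)

# Variant B: preprocess first, as the instructions say
X_pre, vocab_pre = word_counts([preprocess(t) for t in tweets])

# Tokens that only appear when preprocessing is skipped
print(sorted(vocab_raw - vocab_pre))
```

With any preprocessing that removes URLs, mentions, or digits, the vocabulary and therefore the counts differ between the two variants, so an audit generated from one variant cannot match a solution built with the other.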

nprimo · 2023-12-18T10:43:33Z

Hi @jarmo-seljamaa, thank you for the feedback! The subject and audit are going to be updated in the following days to resolve this issue.

nprimo self-assigned this Dec 11, 2023

nprimo linked a pull request Dec 18, 2023 that will close this issue

CON-2329-tackle-gh-issue-nlp-exercise-7 #2360

Merged

nprimo closed this as completed in #2360 Dec 19, 2023
