Add pos_proportions#6
Conversation
There was a problem hiding this comment.
Good job! You solved that easily :) I've made a couple of suggestions on how to take better advantage of Counter and make it a bit more conscise.
Before I merge I'd like you to change a few things:
Make a new pipeline component for pos_proportions. None of the other functions in descriptive_stats need a pos-tagger which is nice if you just want to calculate descriptive statistics fast (without running the text through a model first). This is probably also the reason why the test fails: only an empty spacy pipeline is loaded for the descriptive_stats tests. So, to keep it this way, make a new script in the components folder where you implement the pos_proportions method and make it play with the rest of the package.
This requires a bit of fiddling with init.py folders and some other stuff. Reach out of you have issues!
Make PR more pythonic Co-authored-by: HLasse <lasseh0310@gmail.com>
|
Ah, makes a ton of sense! I've moved CodeFactor is flagging an issue with dataframe_extract. It probably isn't due to my commit, so I'll leave it be for now. Let me know if you'd like more changes! |
|
looks good! let's keep the methods simple for now - we can always add complexity when required. If you update the docs then I'll merge |
|
i tried to be smart and do the review in githubs online editor which somehow made me prematurely merge into main.. I'll clean some of the code if you go ahead with docs |
Here goes!
The function runs fine separate from the package:
However, the test fails with:
I wager that's because I've not implemented the function correctly in the package somewhere, and would love a hand with that :-)