Nouns #3

Open
molliem opened this issue Apr 18, 2018 · 9 comments
Labels
code, help wanted, mozsprint

Comments

@molliem
Collaborator

molliem commented Apr 18, 2018

Letters for women are more likely to use adjectives instead of nouns

@molliem
Collaborator Author

molliem commented May 3, 2018

Goal: Develop code that can read text for the presence of nouns that highlight roles/positions (like leader, researcher). If position nouns are absent, return a summary statement that directs the author to consider using nouns to strengthen the letter.

This one can be complicated. The goal is to distinguish descriptions that use a position noun from those that use adjectives or verbs instead, or that weaken the position noun (e.g., "she was involved in research", "she taught").
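A minimal sketch of the goal described above, assuming a plain word-list lookup. The list `POSITION_NOUNS` and the function names are hypothetical; only "leader" and "researcher" come from this issue, and a real list would need to be much longer.

```python
import re

# Hypothetical starter list; the issue names only "leader" and "researcher".
POSITION_NOUNS = {"leader", "researcher"}


def find_position_nouns(text, nouns=POSITION_NOUNS):
    """Return the position nouns found in text, in order of appearance."""
    found = []
    for word in re.findall(r"[a-z]+", text.lower()):
        if word in nouns:
            found.append(word)
        elif word.endswith("s") and word[:-1] in nouns:
            # Treat a simple plural ("leaders") as a match for "leader".
            found.append(word[:-1])
    return found


def noun_feedback(text):
    """Return a one-line summary statement for the letter's author."""
    found = find_position_nouns(text)
    if found:
        return "Position nouns found: " + ", ".join(sorted(set(found)))
    return ("No position nouns found. Consider using nouns such as "
            "'leader' or 'researcher' to strengthen the letter.")
```

A lookup like this cannot tell whether a noun is weakened by its context ("she was involved in research"), which is the complicated part noted above.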

molliem added the help wanted, mozsprint, and code labels May 3, 2018
@vassiki
Contributor

vassiki commented May 5, 2018

This project sounds amazing; congrats on pitching it. I'm going to start putting together a script to track the frequency of nouns and adjectives in different letters, and potentially play with sentiment analysis to figure out whether letters for men show a stronger positive sentiment, which would indicate a higher use of superlatives. I do most of my coding in Python. Is that going to be a problem?

@cmd16
Collaborator

cmd16 commented May 9, 2018

I am also interested in working on this problem. One way to approach this is to make a list of relevant nouns and their corresponding verbs and check the relative frequencies of these (e.g., if "leads" or "led" is used more than "leader"). That seems like a fairly simple first step and I could work on that. It's probably also a good idea to use POS tagging to detect passive voice, as that would catch things like "was involved".

What programming language are we using? I'm most comfortable with Python, though I've done some coding in Perl (I know other languages, but I don't think any of them would be good for this sort of problem). What sort of POS tagging would we use? I've used TreeTagger for Python and Lingua::EN::Tagger for Perl, but I know Python's NLTK has several POS taggers built in. I've also used spaCy a little bit, but I'm less familiar with that.
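The noun-versus-verb comparison suggested in this comment could be sketched roughly as follows. The mapping `NOUN_TO_VERBS` and the function names are hypothetical, and the sketch uses plain word counts rather than any POS tagger.

```python
import re
from collections import Counter

# Hypothetical mapping from a position noun to verb forms for the same activity.
NOUN_TO_VERBS = {
    "leader": {"lead", "leads", "led", "leading"},
    "researcher": {"research", "researches", "researched", "researching"},
    "teacher": {"teach", "teaches", "taught", "teaching"},
}


def noun_verb_counts(text):
    """Map each position noun to a (noun_count, verb_count) pair."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    result = {}
    for noun, verbs in NOUN_TO_VERBS.items():
        noun_count = counts[noun] + counts[noun + "s"]  # singular plus simple plural
        verb_count = sum(counts[verb] for verb in verbs)
        result[noun] = (noun_count, verb_count)
    return result


def verbs_dominate(text):
    """Return the nouns whose verb forms appear more often than the noun itself."""
    return sorted(
        noun for noun, (n, v) in noun_verb_counts(text).items() if v > n
    )
```

Word counts alone will miscount forms such as "research" or "lead" when they are used as nouns, so a POS tagger would still be needed to separate those cases and to catch passive constructions like "was involved".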

@molliem
Collaborator Author

molliem commented May 9, 2018

Python is perfect! That is the language I know best! I'm still learning programming, so that isn't saying much! I am working on setting up a website (www.biascorrect.com). I'm hoping to have that ready to go by Thursday.

Feel free to use the POS tagging you are most comfortable with! Please remember to add your names to the contributors page as well. I want to be certain to recognize all the contributions.

@molliem
Collaborator Author

molliem commented May 9, 2018

Thank you both for the kind words, support, and help!

@j6k4m8
Member

j6k4m8 commented May 16, 2018

@molliem — love love love this project, and looking forward to helping!

I did a quick search of the repo and it doesn't look like anyone's mentioned proselint here. It is a general prose-checking framework, with tips like weasel_words.very ("don't use the word 'very'") or typography.symbols.curly_quotes ("use curly quotes “”, not straight quotes ""), and the needs of this project reminded me of proselint's plugin-based architecture.

In short, each 'plugin' has its own rules, ways of checking, and error messages, and each is completely independent of the others. So an adjectives_vs_nouns plugin can use a totally different technology to check for bias than a stereotypes plugin.

Thought I'd drop the link here in case it's a useful reference, but in the meantime, looking forward to getting started wherever is most helpful!
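As a rough illustration of the plugin idea in this comment, here is a minimal sketch of independent checks behind a shared registry. This is not proselint's actual API; `register`, `CHECKS`, `run_checks`, and both example checks are hypothetical names made up for the sketch.

```python
import re

# Registry mapping a check name to its function. Hypothetical, not proselint's API.
CHECKS = {}


def register(name):
    """Decorator that adds a check function to the registry under name."""
    def decorator(func):
        CHECKS[name] = func
        return func
    return decorator


@register("adjectives_vs_nouns")
def check_position_nouns(text):
    # Illustrative rule: warn when no position noun appears at all.
    if re.search(r"\b(leader|researcher)s?\b", text, re.IGNORECASE):
        return []
    return ["Consider adding position nouns such as 'leader' or 'researcher'."]


@register("weasel_words.very")
def check_very(text):
    # One message per occurrence of the word 'very'.
    return ["Avoid the word 'very'."
            for _ in re.finditer(r"\bvery\b", text, re.IGNORECASE)]


def run_checks(text):
    """Run every registered check and keep only those that produced messages."""
    report = {}
    for name, check in CHECKS.items():
        messages = check(text)
        if messages:
            report[name] = messages
    return report
```

Each check only has to accept the letter text and return a list of messages, so one check could use a word list while another uses a POS tagger without either knowing about the other.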

@molliem
Collaborator Author

molliem commented May 17, 2018

@j6k4m8 Thank you for the kind words! And thanks for the link to proselint!! I hadn't heard of it and it is a fabulous reference!

Although we haven't been tackling this project as plugins, our approach feels similar. I divided the issues up into topics and people have been working on scripts for each topic. My plan at the end is to use a wrapper or a for loop to combine the separate topics into one.

Do you know Python? There are four issues that no one has tackled: superlatives, family life, minimal assurance, and raises doubt. Help on any of those would be great. If you know web design, I could use some help there too. The site is pretty plain.

Thanks for reaching out! Excited to have you join the team!

@j6k4m8
Member

j6k4m8 commented May 17, 2018

Python or web-design or both! Up to you, wherever you'd prefer to have more help!

@molliem
Collaborator Author

molliem commented May 17, 2018

Amazing! It would be great if you could work on family life, minimal assurance or raises doubt (any of them). My goal was to identify the presence of words and phrases associated with these areas and give feedback, but also to highlight the words in the text box. If you need help with word lists, I can probably tackle that this weekend.
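One possible shape for the "find the words and highlight them" step, sketched under assumptions: the terms in `FAMILY_LIFE_TERMS` are placeholders invented for illustration (the real word lists had not been written yet), and highlighting is done by wrapping matches in HTML `<mark>` tags, which may not be how the website's text box ends up working.

```python
import html
import re

# Hypothetical placeholder terms; the real word lists were still to be written.
FAMILY_LIFE_TERMS = ["family", "children", "maternity leave"]


def highlight_terms(text, terms=FAMILY_LIFE_TERMS):
    """Return (html_text, matches) with every matched term wrapped in <mark>."""
    # Longer terms first so multi-word phrases win over shorter overlaps.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(term) for term in ordered) + r")\b",
        re.IGNORECASE,
    )
    pieces = []
    matches = []
    last = 0
    for match in pattern.finditer(text):
        # Escape the untouched text so user input cannot inject markup.
        pieces.append(html.escape(text[last:match.start()]))
        pieces.append("<mark>" + html.escape(match.group(0)) + "</mark>")
        matches.append(match.group(0).lower())
        last = match.end()
    pieces.append(html.escape(text[last:]))
    return "".join(pieces), matches
```

The returned list of matches can feed the feedback message, while the marked-up string can be rendered wherever the letter is displayed.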

j6k4m8 pushed a commit that referenced this issue Feb 23, 2020

4 participants