① Determine data structure itself #5

tatianamac · 2019-10-21T00:31:55Z

Determine the structure of the data itself. I get the impression that there are a few dictionary features that are still being specced, so this sub-task is probably best left to @tatianamac for the moment.

Currently, each word has:

Flag: Use/Avoid
Label: Ableist/Slur
Definition
Part of speech
Benefits/issues (depending on use/avoid)
Impact
Alt words
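
As a rough illustration of the list above, the current per-word fields could be written down as a single flat record. This is only a sketch: the class name `WordEntry`, the field names, and the placeholder values are assumptions made for the example, not anything decided in this thread.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Flag(Enum):
    """Whether the dictionary suggests using or avoiding the word."""
    USE = "use"
    AVOID = "avoid"


@dataclass
class WordEntry:
    # Hypothetical flat record mirroring the list of current fields.
    word: str
    flag: Flag
    label: str            # e.g. "ableist" or "slur"
    definition: str
    part_of_speech: str
    notes: List[str]      # benefits when flag is USE, issues when flag is AVOID
    impact: str
    alt_words: List[str] = field(default_factory=list)


# Placeholder values only; no real dictionary content is implied.
entry = WordEntry(
    word="example-term",
    flag=Flag.AVOID,
    label="ableist",
    definition="(placeholder definition)",
    part_of_speech="adjective",
    notes=["(placeholder issue)"],
    impact="(placeholder impact)",
    alt_words=["(placeholder alternative)"],
)
```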

Future features will include:

Literature
Attribution/authorship
Nuance or alternate opinions (aside: one of the challenges of this project will be that identity is nuanced; two nonbinary people may feel differently about the same definition, so we'll need to find tactful and considerate ways to approach this through the structure of the definitions).
Link string (see URL wishes here).

Possible features could include:

Upvoting/downvoting? I am not sure how helpful this really is, but something I've thought about.


tatianamac · 2019-10-21T00:43:24Z

@good-idea Here is the structure I'm thinking of. Per your question, I think that infrastructurally, it's probably not much different from a standard online dictionary. However, as I have future hopes for this to integrate into Twitter/Slack bots and the API, I'm not sure how that should affect the way we structure each individual word.

good-idea · 2019-10-21T01:31:46Z

This all looks good and makes a lot of sense. Some more Q's to discuss:

Should words support multiple definitions/contexts? (i.e. my 'crazy' example in the other thread - each definition would imply a different set of alternatives)
If so, maybe the definitions should include the flag, part of speech, benefits, issues, and impacts?

There might be some terms that should be avoided in some circumstances but can be used in a non-harmful way in others. Perhaps:

"That person is homeless" - harmful/avoid
"Los Angeles' homeless population is rising" - not harmful*
- *I'm not sure how I actually feel about saying this is "not harmful", but, just bringing it up as an example

I think breaking the word down into its various definitions could give us more flexibility when it comes to dealing with nuances.

good-idea · 2019-10-21T01:44:04Z

@lynncyrin following up from the other issue:

would recommend against picking technologies at this point

Totally agree. I mention GraphQL because it pretty much takes defining a data structure as its starting point, and makes all of the data types and relationships explicit. Even if we don't end up using GraphQL, putting all of our decisions into a schema definition will make sure we're all on the same page and that the structure is clear and coherent.

Based on Tatiana's list above + breaking words down into multiple definitions, a starting point could be:

type Word {
  word: String!
  definitions: [Definition!]!
  linkString: String!
}

enum PartOfSpeech {
  NOUN
  ADJECTIVE
  # ...etc
}

type Definition {
  benefits: [String]
  issues: [String]
  impact: [String]
  partOfSpeech: PartOfSpeech!
  # Flag to 'avoid'. Or, make this an array of flags that are more like tags
  flag: Boolean
  labels: [Label]!
  alternatives: [Word]!
}

# Labels for definitions
# i.e. "ableist" / "slur", etc, could include 'positive' labels too, i.e. "inclusive", "gender-neutral"
type Label {
  name: String
  description: String
}
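
To make the relationships in the schema above concrete, here is a minimal Python mirror of the same types. It is a sketch under assumptions: the snake_case field names, the default values, the helper `definitions_to_avoid`, and the placeholder term are all hypothetical, and `flag` is read as "True means avoid" following the schema comment.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class PartOfSpeech(Enum):
    NOUN = "NOUN"
    ADJECTIVE = "ADJECTIVE"


@dataclass
class Label:
    name: str
    description: str = ""


@dataclass
class Definition:
    part_of_speech: PartOfSpeech
    flag: bool = False  # assumption: True means "avoid"
    labels: List[Label] = field(default_factory=list)
    benefits: List[str] = field(default_factory=list)
    issues: List[str] = field(default_factory=list)
    impact: List[str] = field(default_factory=list)
    alternatives: List["Word"] = field(default_factory=list)


@dataclass
class Word:
    word: str
    link_string: str
    definitions: List[Definition] = field(default_factory=list)


def definitions_to_avoid(word: Word) -> List[Definition]:
    """Return only the definitions of a word that are flagged 'avoid'."""
    return [d for d in word.definitions if d.flag]


# Placeholder data: one word with two definitions, only one flagged.
alternative = Word(word="placeholder-alternative", link_string="placeholder-alternative")
term = Word(
    word="example-term",
    link_string="example-term",
    definitions=[
        Definition(
            part_of_speech=PartOfSpeech.ADJECTIVE,
            flag=True,
            labels=[Label(name="ableist")],
            alternatives=[alternative],
        ),
        Definition(part_of_speech=PartOfSpeech.NOUN, flag=False),
    ],
)
```

Because the flag lives on `Definition` rather than on `Word`, each context carries its own labels and alternatives, which is the flexibility discussed earlier in the thread.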

good-idea · 2019-10-21T01:49:50Z

Future features will include:

Literature
Attribution/authorship

These could be pretty easily added now, I think.

Upvoting/downvoting? I am not sure how helpful this really is, but something I've thought about.

I think this is a crucial issue, but it might make sense to put it in a milestone further down the road. We won't be able to handle this if we're just storing the library as JSON. A JSON library makes a lot of sense for getting off the ground, but it means that only developers can contribute - eventually, anyone should be able to have their say by clicking a button (or something).
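
For a sense of what "storing the library as JSON" might look like in practice, here is a small sketch that parses a JSON library and looks up a term. The JSON layout, the key names, and the helpers `load_library` and `lookup` are assumptions for illustration; nothing here reflects an agreed file format.

```python
import json
from typing import Optional

# Hypothetical JSON library keyed by lowercase term; placeholder content only.
LIBRARY_JSON = """
{
  "example-term": {
    "linkString": "example-term",
    "definitions": [
      {
        "partOfSpeech": "ADJECTIVE",
        "flag": true,
        "labels": ["ableist"],
        "alternatives": ["placeholder-alternative"]
      }
    ]
  }
}
"""


def load_library(raw: str) -> dict:
    """Parse the whole library from a JSON string."""
    return json.loads(raw)


def lookup(library: dict, term: str) -> Optional[dict]:
    """Case-insensitive lookup; returns None when the term is not in the library."""
    return library.get(term.lower())


library = load_library(LIBRARY_JSON)
entry = lookup(library, "Example-Term")
```

The file is read-only from the reader's point of view: changing an entry or recording a vote would mean editing the JSON and committing it, which is why contributions would be limited to developers in this setup.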

good-idea · 2019-11-01T23:10:06Z

Here's a quick example/exploration of how to structure it with JSON files:

https://repl.it/@good_idea/WhimsicalGreenyellowDemos

The challenge is going to be finding the right spot between complexity and simplicity. Language is so fluid that it's hard to really define it in any rigid structure - but since we're doing this in code, there needs to be one. A simpler structure will make it a simpler tool to use, but not account for edge cases. For example:

Simple option: mark Words as "avoid"
- Pro: Covers most use cases; easy to parse a sentence and find any words that are flagged as "avoid"
- Pro: makes definitions easier to add to the dictionary
- Con: There will be instances where a word is OK to use in one context, but not another. The word savage is a definite "avoid" when referring to a person, but would be acceptable when talking about feral animals or brutality in general, e.g. "a savage criticism"
Complex option: mark Definitions (or contexts) as "avoid"
- Pro: able to more discretely determine when to avoid a word, based on the context.
- Con: Now it's more complicated when using the API or dictionary module. How does the code look at a tweet or Slack message and pick out words to avoid? All of a sudden it sounds like it needs some kind of language processing/context recognition.
- Con: adding definitions to the dictionary is now more complex

🤔
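
The simple option can be sketched in a few lines, which also shows its main drawback. This is an illustration only: the set `AVOID_WORDS`, the function `flagged_words`, and the tokenizing regex are assumptions, and the one entry is taken from the "savage" example above.

```python
import re
from typing import List, Set

# Word-level "avoid" list (simple option); a single entry for illustration.
AVOID_WORDS: Set[str] = {"savage"}


def flagged_words(text: str, avoid: Set[str] = AVOID_WORDS) -> List[str]:
    """Return every token in the text that appears in the avoid list."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [token for token in tokens if token in avoid]


# The word is flagged even in the context described above as acceptable,
# because a word-level flag has no notion of context.
print(flagged_words("That was a savage criticism"))  # -> ['savage']
```

Avoiding this false positive would require flags on definitions plus some way of deciding which definition a sentence is using, which is the language-processing problem raised under the complex option.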

In terms of the overall project, it might be better to take a simpler approach at first. If the "flagship" part of the project is the /me/my+terms pages, then this reduces the complexity needed. The flag would indicate "avoid using this term to define a person or thing" instead of just "avoid using this term (but maybe not in different circumstances)".

But this would mean that a Slack bot (or other scripts/plugins/bots) would be more limited.

mjoynes-wombat-web · 2020-08-29T23:21:52Z

I've started mapping out the data structure we would need if we used MySQL as the database. This is part of my example setup using MySQL, Elasticsearch, and a serverless function for the API.

https://github.com/ssmith-wombatweb/api/blob/local-test/elasticsearch-mysql/test-api-environment/elasticsearch-mysql/MySQL%20Data%20Structure.md

tatianamac transferred this issue from selfdefined/web-app Aug 29, 2020
