
Apply inclusion rules for political entities #312

Merged on Jun 7, 2020 (3 commits)

axelboc (Collaborator) commented May 17, 2020

UPDATE - see #312 (comment) and following comments for final inclusion rules and removal stats.


Fixes #306, fixes #221.

Alright, so I've applied the latest inclusion rules discussed in #306. Here's what I've done exactly:

  • Added the spreadsheet with the filtered lists at the root of the repository and renamed it to political-entities.xlsx
  • Created a temp folder inside src with the following:
    • data-temp.csv - contains the data that moves out of the deck (I followed option D suggested by @aplaice in #307, "Upgrade process when data is removed (v4 upgrade)", without changing the GUIDs yet) -- note that I've added blank lines to group notes that are removed completely (first group) or only partially (second group).
    • media folder - contains the media files that move out of the deck.
    • to-be-added.txt - lists the notes that will need to be added later on to the deck.

The new inclusion rules would lead to the removal, in the standard deck, of:

  • 21 location cards, or 6.5 % of the 321 cards currently in the deck. If we consider the 6 maps that would be added, this goes down to 15 cards or 4.7 %.
  • 47 flag cards, or 18.1 % of the 260 cards currently in the deck.
  • 104 capitals cards (= 2 templates * 52 capitals), or 19.9 % of the 524 cards currently in the deck.

The net result would be the removal of 165 cards, or 14.9 % of the 1105 cards currently in the deck.


I have to admit that this is a way more significant cull than I had envisioned, sorry... TBH, I don't think removing more than 5~8 % of the deck is wise.

FWIW, I really like the result for autonomous islands, transcontinental areas and exclaves - i.e. Ceuta, Saba, etc. being removed completely, and Corsica, Sicily, Galápagos Islands, etc. having their capitals and flags removed.

However, I think it would be wiser to rethink the rules for dependent territories. Here is one suggestion:

  • include all of them with maps;
  • apply the OR rule for inclusion with capitals/flags (potentially tweaking the area/population thresholds);
  • remove the now-obsolete AND rule.

I think the above brings some nice benefits (i.e. exhaustiveness, while still removing some of the most obscure capitals and flags), and would not have such a strong impact on the deck.

What are your thoughts?

aplaice (Collaborator) commented May 17, 2020

Wow!

I understand that we'll wait with the (probable) full removal of entities like Mount Athos or the (probable) removal of flag/capital from Bali, for a future analysis of non-island autonomous countries and of non-autonomous islands (as physical entities)?


> I have to admit that this is a way more significant cull than I had envisioned, sorry... TBH, I don't think removing more than 5~8 % of the deck is wise.

Do you mean removing more than 5-8 % of the current cards, or on net? If the latter, then we can easily (and sensibly) fill up the gap with islands (as non-political entities) and potentially regions, deserts, lakes, mountain ranges, non-island autonomous regions or even some more seas. :D

> apply the OR rule for inclusion with capitals/flags (potentially tweaking the area/population thresholds);

On a gut level, I care far more about, and am far more likely to ever encounter in real life, the capitals (and even flags) of Sicily, Corsica, Bali (non-autonomous though it is) or Zanzibar than those of, say, Jersey (which is among the "marginal" dependent territories likely to get the capital/flag), irrespective of their political status. I would guess that this holds for many people. That isn't to say that I want the capitals of the former kept, but I'm also not enthusiastic about re-adding (keeping) them for the latter group.

> include all of them with maps;

I feel slightly less opposed to including all the dependent territories with maps, for exhaustiveness, though now that I look at them, some are absurdly small (e.g. Pitcairn Islands, pop. < 60)...

Hence, my alternative suggestion:

  1. Include all the dependent territories with maps.

  2. Get rid of the now obsolete OR criterion.

However, I'd also be happy with the full cull.


FWIW I haven't yet thoroughly looked at the commit, but on a cursory glance, it looks great implementation-wise!

axelboc (Collaborator, Author) commented May 17, 2020

> I understand that we'll wait with the (probable) full removal of entities like Mount Athos or the (probable) removal of flag/capital from Bali, for a future analysis of non-island autonomous countries and of non-autonomous islands (as physical entities)?

Ha! No, I just forgot about them. 😄 I've removed Mount Athos completely and removed Bali's capital and flag for now. I've also changed Bali's capital info to "Island of Indonesia" and updated the numbers in the PR description.

axelboc (Collaborator, Author) commented May 17, 2020

> Do you mean removing more than 5-8 % of the current cards, or on net? If the latter, then we can easily (and sensibly) fill up the gap with islands (as non-political entities) and potentially regions, deserts, lakes, mountain ranges, non-island autonomous regions or even some more seas. :D

I did mean net, but only within the political geography side of the deck. Adding more physical geography notes is not going to happen in the short term, so I prefer not to assume too much for now. 😄 Basically, I'm all for removing a bunch of cards, but I don't want it to be a frustrating experience for our users, you know?

You make good points about some dependent territories being very small in terms of population. I went back to the spreadsheet to play with the criteria a bit and try to get more sensible results... I think I managed to keep a few more notes and a few more capitals and flags, while not going so far as to include maps for every dependent territory:

political-entities.xlsx

Here's an overview:

  • Include dependent territories with map if population >= 15,000 OR area >= 1,000 km² -- this excludes 10 minor territories but keeps Svalbard and Falkland Islands.
  • Include dependent territories with capital and flag if population >= 100,000 OR (population >= 15,000 AND area >= 1,000 km²) -- this includes 9 highly populated territories, as well as Greenland, Faroe Islands and Åland Islands, but notably excludes Jersey (118 km²) and Svalbard (2,667 inhabitants).
  • Include autonomous islands with map if population >= 100,000 -- I realised that Madeira and Mayotte, which both have around 280,000 inhabitants, were being excluded because of their area; I think applying a high population threshold on its own makes more sense.
  • Include transcontinental territories and exclaves with map if population >= 100,000 AND distance >= CLOSE -- similarly, I think removing the area threshold makes sense; note that this removes Socotra Governorate and Galápagos Islands.
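
For readers who prefer code to prose, the four rules above can be sketched as Python predicates. This is only an illustration, not how the spreadsheet is implemented: the `Entity` class, the function names and the `at_least_close` flag (standing in for "distance >= CLOSE") are hypothetical, the area is assumed to be in km² (as the 118 km2 figure for Jersey suggests), and the area values in the demo are approximate figures that do not come from this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """Hypothetical record for one row of the spreadsheet."""
    name: str
    population: int
    area_km2: float


def dependent_territory_has_map(e: Entity) -> bool:
    # population >= 15,000 OR area >= 1,000
    return e.population >= 15_000 or e.area_km2 >= 1_000


def dependent_territory_has_capital_and_flag(e: Entity) -> bool:
    # population >= 100,000 OR (population >= 15,000 AND area >= 1,000)
    return e.population >= 100_000 or (
        e.population >= 15_000 and e.area_km2 >= 1_000
    )


def autonomous_island_has_map(e: Entity) -> bool:
    # population >= 100,000, with no area threshold
    return e.population >= 100_000


def transcontinental_or_exclave_has_map(e: Entity, at_least_close: bool) -> bool:
    # population >= 100,000 AND distance >= CLOSE; the distance scale itself
    # lives in the spreadsheet, so it is reduced to a boolean here.
    return e.population >= 100_000 and at_least_close


if __name__ == "__main__":
    # Population from the thread; area is an approximate outside figure.
    svalbard = Entity("Svalbard", population=2_667, area_km2=61_022)
    print(svalbard.name, "map:", dependent_territory_has_map(svalbard))
    print(
        svalbard.name,
        "capital and flag:",
        dependent_territory_has_capital_and_flag(svalbard),
    )
```

With these definitions Svalbard keeps its map thanks to its area but gets no capital or flag because of its small population, which matches the outcome described in the list.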

What do you think?

aplaice (Collaborator) commented May 17, 2020

This is great!

Its one disadvantage is obviously that it's more complicated, but I think the thresholds make general sense, and I really like the result! Since the various thresholds will be in the spreadsheet, in the repo, the additional complexity isn't likely to be an issue.

axelboc force-pushed the political-entities branch 5 times, most recently from 9d224d8 to e368a5d (May 19, 2020 21:01)

axelboc (Collaborator, Author) left a comment

Alright, I've applied the latest rules as discussed. The new numbers are as follows:

  • 41 capitals removed
  • 38 flags removed
  • 17 maps removed
  • 7 maps added

This gives us a net reduction of 133 cards, or 12.0 % of the deck, broken down as follows:

  • 82 capital cards (15.7 % of 524)
  • 38 flag cards (14.6 % of 260)
  • 10 map cards (3.1 % of 321)
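
As a rough sketch of how the per-type card counts in this breakdown follow from the note counts listed just before it: capitals generate 2 cards per note (the "2 templates" mentioned in the PR description), while one card per flag or map note is an assumption inferred from the figures in this thread. The names below are hypothetical.

```python
# Cards generated per note for each card type. The value 2 for capitals comes
# from the PR description; the value 1 for flags and maps is an assumption.
TEMPLATES_PER_NOTE = {"capital": 2, "flag": 1, "map": 1}


def net_cards_removed(kind: str, removed: int, added: int = 0) -> int:
    """Net number of cards leaving the deck for one card type."""
    return (removed - added) * TEMPLATES_PER_NOTE[kind]


breakdown = {
    "capital": net_cards_removed("capital", removed=41),
    "flag": net_cards_removed("flag", removed=38),
    "map": net_cards_removed("map", removed=17, added=7),
}
print(breakdown)  # {'capital': 82, 'flag': 38, 'map': 10}
```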

Although this is still significant, I'm happy with the result. I was pretty much saying "well, I won't be missing you" to every piece of content I moved to data-temp.csv... 😄

Interestingly, most of the political entities that didn't have a capital or a flag to start with -- like Tokelau, Easter Island, or some of the French overseas departments -- are now either fully removed or have had their capital/flag removed. I also removed a bunch of capital hints (e.g. Norfolk Island/Jamaica), which felt very good as well.


I guess the next step would be to have someone else go through the diff to make sure the removals match the spreadsheet and that I didn't forget to remove a hint or something (or removed one by mistake even).

If there are any last-minute objections to the inclusion rules, some entities you think should be brought back in, or removed completely instead of partially, ... or whatever, now is the time!

After that, I think it'll be good to merge this PR (after a bit of tidying, perhaps) so we can keep the temporary folder up to date until we actually release v4.0. There may be changes worth applying to removed entities, notably fixes to the French translation (#293).


axelboc mentioned this pull request May 20, 2020

axelboc force-pushed the political-entities branch 2 times, most recently from 3027d5d to 094db70 (June 3, 2020 19:50)

axelboc changed the title from "[WIP] Apply inclusion rules for political entities" to "Apply inclusion rules for political entities" (Jun 3, 2020)

axelboc (Collaborator, Author) commented Jun 3, 2020

I've updated sources.csv, double-checked the removed media, moved the temp folder to the root of the repo and renamed it removed... but most importantly, I've documented the latest inclusion rules in CONTRIBUTING.md. Please do proofread if you have time! 🙏


axelboc and others added 3 commits June 7, 2020 16:38
axelboc (Collaborator, Author) commented Jun 7, 2020

Well spotted for Niue and Cook Islands!! 👓

You're totally right about prioritising entity types. I've done as you suggested:

  • clarified CONTRIBUTING.md,
  • removed Niue and Cook Islands from the list of dependent territories,
  • moved French Guiana to the list of transcontinental territories,
  • removed Ceuta and Melilla from the list of enclaves/exclaves,
  • updated the removal stats in #312 (review).

aplaice (Collaborator) commented Jun 7, 2020

I don't see any remaining issues and I don't think we'll catch anything more (and even if we do, we can just fix it on master), so I'd vote for merging!


Apparently even people who are dedicated enough to file issues about the flag of the Northern Mariana Islands don't mind its removal, so I think we're definitely doing "the right thing", even if a 12 % removal is more than what I'd have expected at the very start.

axelboc (Collaborator, Author) commented Jun 7, 2020

Yeah, it's reassuring 😄

Alright, let's do this!! 🤘

axelboc merged commit ba99fae into master Jun 7, 2020
axelboc deleted the political-entities branch June 7, 2020 16:17

Issues closed by merging this pull request: "Inclusion rules for political entities" and "Missing British Overseas Territories".