What data to include in Life Version? #3

mfranke93 · 2021-12-22T10:48:05Z

Initially opened by gs108488 in 211@TIK. History:

gs108488 Dec 15, 2021:

Similar to issue #172, we need to define which data is included in the Life Version?

tutebatti · 2022-01-05T13:36:37Z

"Persons" should not appear in the Public Version. Right, @rpbarczok?

rpbarczok · 2022-01-18T09:25:21Z

Yes, they should not.

tutebatti · 2022-01-18T09:27:55Z

Thank you. @rpbarczok, please also see the original question referred to in the first post here, which links to #172 in the old repository on the GitHub instance hosted by Uni Stuttgart.

rpbarczok · 2022-01-18T16:30:03Z

@mfranke93: Would it be a problem if I changed the short titles of the sources? At the moment they are either not very self-explanatory or in German.

mfranke93 · 2022-01-18T16:32:28Z

Not at all.

mfranke93 · 2022-01-19T09:16:04Z

Only the following tags should be visible:

@rpbarczok What about the tag Metropolitan Residence?

Also, since you mentioned something about renaming the source short names yesterday, please make sure the source list is spelled correctly here. Right now, there are 31 sources in your list, but only 28 in the database that exactly match those short names.

mfranke93 · 2022-01-19T09:41:10Z

These four tags from your list do not exist:

First degree capital
Metropoliten residence
Second degree capital
Third degree capital

I assume the capitals are missing the hyphen (first-degree...) and the other one should be "metropolitan".

The sources are fine now.

rpbarczok · 2022-01-19T09:45:24Z

And again:
Evidences:

All evidences that are visible and marked by the tag "DhiMu"

Only the following tags should be visible:

Bishopric
Metropolitan see
Patriarchate
Diocese without specified ranking
Part of an episcopal see
Episcopal residence
Metropolitan residence
Patriarchal residence
First-degree capital
Second-degree capital
Third-degree capital
Mosque or Madrasa
Synagogue or Yeshiva or Beth Midrash or Beth Din
Church or Monastery
Reviewed

Following sources:

Fiey, OCN
DHGE
EI 2
EIran
EJ
AKg
TIB 5
CE
EI 1
EI 3
TIB 15
Vest, Melitene
TIB 2
Munier, Eglise copte
EJIW
Hamilton, Latin Church
Timm, Christlich-koptische Ägypten
Muqaddasi (arab.)
Muqaddasi (engl.)
Synode of Sis 1307 (lat.)
Synode of Sis 1307 (arm.)
Synode of Adana 1316 (lat.)
Synode of Adana 1316 (arm.)
Synode of Sis 1342 (lat.)
Benjamin of Tudela (engl.)
Benjamin of Tudela (hebr.)
PmbZ
Letter by Benedictus XII 1341 (lat.)
Coronation of Levon I (fr.)
Council of Hromklay 1179
Schick, Christian Communities of Palestine

mfranke93 · 2022-01-19T09:51:19Z

I'm not sure where you keep copying this from, but please stop, because I have to copy it out of here again, and there are still typos in there. I started keeping my own list based on the database even with the old issue at TIK.

To summarize:

Regarding tags, nothing changes compared to the old status: Remove the tags DhiMu, Annotator beta test, eOC, Non-residential, Community.
Keep evidence from the sources in the list above, and the sources themselves.

rpbarczok · 2022-01-19T09:56:54Z

I copied it from the visualisation.

To your summary: Yes

mfranke93 · 2022-01-19T10:14:21Z

The problem is that for the data export, the comparison of texts is done exactly, so if the tag names or source short names do not match *exactly,* the data in question will just not be ignored. So I would suggest copy-pasting directly from pgAdmin or the visualization and not typing it. There is already a difference between “First degree capital” and “First-degree capital”, and also between “Metropolitan residence” (your capitalization) and “Metropolitan Residence” (DB and visualization capitalization). But I think we are in agreement now. Keep in mind that, if you plan to rename some tags or continue editing the source short names, we will need to update this list again.
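
The exact-match problem described here can be caught before an export runs. The following is a minimal sketch, not the project's actual tooling: the helper `check_names` and the short `db_tags` list are hypothetical, and only the tag spellings are taken from this thread. It splits a requested list into exact matches and misses, and uses `difflib.get_close_matches` from the Python standard library to suggest the database spelling that was probably meant.

```python
from difflib import get_close_matches


def check_names(requested, in_database):
    """Split requested names into exact matches and misses with a suggestion."""
    known = set(in_database)
    # Only byte-for-byte identical names count as a match.
    matched = [name for name in requested if name in known]
    # For every miss, look up the single closest database name (if any).
    missing = {
        name: get_close_matches(name, in_database, n=1, cutoff=0.8)
        for name in requested
        if name not in known
    }
    return matched, missing


# Illustrative subset of database tag names (spellings as discussed above).
db_tags = ["First-degree capital", "Metropolitan Residence", "Bishopric"]
# Hand-typed list with a missing hyphen and a capitalization difference.
requested_tags = ["First degree capital", "Metropolitan residence", "Bishopric"]

matched, missing = check_names(requested_tags, db_tags)
for name, suggestions in missing.items():
    print(f"no exact match for {name!r}; closest database name: {suggestions}")
```

A check like this only reports mismatches; copy-pasting the names from pgAdmin or the visualization, as suggested above, avoids them in the first place.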

rpbarczok · 2022-01-19T10:20:58Z

OK. Understood. Sorry for the mess. I am in some stress at the moment, since I have to organize my departure from the project.

It does only affect the sources and the tags, though?

mfranke93 · 2022-01-19T12:10:41Z

> OK. Understood. Sorry for the mess. I am in some stress at the moment, since I have to organize my departure from the project.

Understandable, and no problem. I just want to make sure we don't accidentally include too little data in the public version (by excluding a source or the like).

> It does only affect the sources and the tags, though?

Yes. There is other stuff going on when doing the data cleanup, but it is all based on the set of tags and sources to include. For reference, here is the current SQL script for doing that. However, I need to add the person stuff, because for the reviewer version, there were still persons involved.

@tutebatti @rpbarczok This comment regarding persons not being included: Does that affect the DaRUS export as well, or do we include the persons there?

tutebatti · 2022-01-19T12:51:45Z

Since the DaRUS export is what we will be cited with and the data other people will (re)use - as far as I understand at least -, I would not include persons nor anything which is still somehow "work in progress" or does not figure under the label DhiMu, e.g. is in the scope of eOC.

rpbarczok · 2022-01-19T12:57:28Z

I believe that we have added people only for Bar Hebraeus and Michael the Great so far, and these sources are not part of the DhiMu. So, the people in the repository would be without any connection to any evidence.

rpbarczok · 2022-01-19T13:09:34Z

Ok, I just looked it up: there is at least one bishop connected to some pieces of evidence, but I would agree with Florian. We only added a few people for testing the process, so there is no gain in adding them.

mfranke93 · 2022-01-19T13:11:32Z

Since the process for exporting the data is removing unneeded stuff rather than just adding what is needed: Am I correct to say: for the public version and the DaRUS dump, all person data should be removed?

rpbarczok · 2022-01-19T13:12:04Z

yes

mfranke93 · 2022-01-19T13:25:13Z

Okay. I have modified and tested the export script, and everything looks good. Just let me know when you are done checking the names.

rpbarczok · 2022-03-01T09:10:45Z

As you have mentioned it in issue #92: I hadn't thought about the fact that all the places are in the DaRUS dump. I thought that, since places without evidence are not visible on the map, they were just something I did not have to concern myself with. Sorry for the premature assumption.
So: Please remove all places that have no evidence with a DhiMu tag from the live version and the DaRUS dump!

rpbarczok · 2022-03-09T16:01:40Z

I want to summarize the above discussion. Basically, all the information has already been mentioned above, with one exception: we changed the short titles of the sources and added two additional sources. To minimize misunderstandings, I added the primary keys to the tags and the sources.
The data dump for DaRUS, and consequently the DB of our live web application, will contain:
1.) All pieces of evidence that a) have the tag DhiMu (16) and b) are connected to a specified set of sources.
2.) The specified set contains the following sources: OrChrN (1), DHGE (2), EI² (3), TIB15 (15), EI (13), Timm (26), EJ (5), Vest (16), PMBZ (42), Hamilton (25), Munier (19), Muqaddasi ara (29), Muqaddasi eng (30), EJIW (20), EIr (4), CoptEnc (11), TIB5 (10), EI³ (14), Schick (63), TIB2 (17), Benjamin eng (40), Benjamin heb (41), Sis 1307 lat (34), Sis 1307 arm (35), AtKG (8), Hromklay 1179 (58), Smbat 1199 fra (57), Sis 1342 lat (39), Adana 1316 arm (37), Adana 1316 lat (38), Benedictus PP. XII (43), Crown (68), Richter-Bernburg (69).
3.) Data that are not connected to these pieces of evidence should not be part of the data dump. That means that all places that do not have a piece of evidence as described in 1.) are to be removed from the dump.
4.) All person data should be removed.
5.) The following tags have to be removed: Community (11), DhiMu (16), eOC (17), Non-residential (19), Annotator beta test (100).
6.) Concerning the religious groups 12S, 7S, and 5S:
a. If possible: in the visualisation they should be treated as Shia (also with respect to the filter function); in the dump the data should stay as they are.
b. If a. is not possible (or only possible with a considerable amount of work), we have to remove the mentioned groups and add the specifics in the religion instance comment.

We are still not finished with the revision of the data, so I have to ask for your patience concerning the final data dump.
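
Rules 1 to 5 of this summary can be sketched in a few lines. This is a hedged illustration only, not the SQL export script referred to earlier in the thread: the `Evidence` record, the `filter_export` function, and the sample rows are hypothetical, "connected to a specified set of sources" is read as "has at least one source from the list", and how person links on evidence are stored is not modelled. The tag and source primary keys are the ones given in the summary.

```python
from dataclasses import dataclass

DHIMU_TAG = 16
# Rule 5: Community, DhiMu, eOC, Non-residential, Annotator beta test.
REMOVED_TAGS = {11, 16, 17, 19, 100}
# Rule 2: primary keys of the sources listed in the summary.
ALLOWED_SOURCES = {
    1, 2, 3, 15, 13, 26, 5, 16, 42, 25, 19, 29, 30, 20, 4, 11, 10,
    14, 63, 17, 40, 41, 34, 35, 8, 58, 57, 39, 37, 38, 43, 68, 69,
}


@dataclass
class Evidence:
    """Hypothetical, heavily simplified evidence record."""
    id: int
    place_id: int
    tag_ids: set
    source_ids: set


def filter_export(evidence, place_ids, tag_ids):
    """Return the data that would remain in the dump under rules 1 to 5."""
    # Rule 1: keep evidence tagged DhiMu with at least one allowed source,
    # then strip the tags and sources that must not appear in the dump.
    kept = [
        Evidence(e.id, e.place_id,
                 e.tag_ids - REMOVED_TAGS, e.source_ids & ALLOWED_SOURCES)
        for e in evidence
        if DHIMU_TAG in e.tag_ids and e.source_ids & ALLOWED_SOURCES
    ]
    return {
        "evidence": kept,
        # Rule 3: only places that still have a piece of evidence remain.
        "places": {e.place_id for e in kept} & set(place_ids),
        "tags": set(tag_ids) - REMOVED_TAGS,
        "sources": set(ALLOWED_SOURCES),
        # Rule 4: no person data is exported at all.
        "persons": set(),
    }


# Made-up sample rows; tag 5 and source 999 are placeholder IDs.
sample = [
    Evidence(1, place_id=10, tag_ids={16, 5}, source_ids={2}),   # kept
    Evidence(2, place_id=11, tag_ids={17}, source_ids={2}),      # no DhiMu tag
    Evidence(3, place_id=12, tag_ids={16}, source_ids={999}),    # source not listed
]
result = filter_export(sample, place_ids={10, 11, 12}, tag_ids={5, 11, 16, 17})
```

In this sample, only evidence 1 and place 10 remain, and the DhiMu tag itself is stripped from the exported evidence even though it was the selection criterion.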

mfranke93 · 2022-03-10T09:09:10Z

Looks good to me.
Regarding 6:

> a. If possible: in the visualisation they should be treated as Shia (also with respect to the filter function); in the dump the data should stay as they are.

Possible in theory, but I really do not want to do that. The software should stay data-agnostic.

> b. If a. is not possible (or only possible with a considerable amount of work), we have to remove the mentioned groups and add the specifics in the religion instance comment.

Would that only happen for the exported data, or for the production database as well? We can automate this as well, one question though: what would happen if there is already a religion instance comment?

tutebatti · 2022-03-10T09:20:04Z

> We can automate this as well, one question though: what would happen if there is already a religion instance comment?

If that is possible, this would be preferable, I'd say. Since the information in the column religion instance comment is not structured in any machine-readable way, I would suggest adding the strings 12S, 7S, and 5S, respectively, at the beginning of the comment column, followed by a semicolon and space (; ), and moving the rest of the content of that column behind it.

rpbarczok · 2022-03-10T09:21:52Z

I agree.

mfranke93 · 2022-03-10T09:25:25Z

Okay. What about if there is no ~~place~~ religion instance comment? Would you then rather have the comment be 7S, or 7S; ? Slight bikeshedding here, but I'd rather ask ;D

tutebatti · 2022-03-10T09:27:34Z

Got us there! I would then omit the ; . However, you mean religion instance comment, right? (Got you back? ;) )

rpbarczok · 2022-03-10T09:40:54Z

One question: Would it be possible to transform 5S to Zayidiya, 7S to Isma'iliya and 12S to Imamiya? That would conform better to the names we use in the comments.

mfranke93 · 2022-03-10T09:45:08Z

You mean specifically for the comments, right? Like: Imamiya; this is the rest of the religion instance comment

rpbarczok · 2022-03-10T09:49:47Z

Exactly

mfranke93 · 2022-03-10T09:55:49Z

No problem at all. To summarize:

Religion instances of 5S, 7S and 12S change to being instances of SHIA. Their comments get prepended with "Zayidiya; ", "Isma'iliya; ", and "Imamiya; ", respectively. If the religion instance comment is empty to start with, the semicolon and space at the end are omitted.
The religions 5S, 7S, and 12S are dropped.
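
The rule summarized here fits in a small function. The sketch below is an illustration under assumptions, not the export script itself: the function name `merge_into_shia` is hypothetical, religions are represented by the codes used in this thread, a missing comment (`None`) is treated the same as an empty one, and surrounding whitespace in existing comments is stripped.

```python
# Comment prefixes agreed on in the discussion above.
SHIA_SUBGROUPS = {"5S": "Zayidiya", "7S": "Isma'iliya", "12S": "Imamiya"}


def merge_into_shia(religion, comment):
    """Map a 5S/7S/12S religion instance to SHIA and prefix its comment."""
    label = SHIA_SUBGROUPS.get(religion)
    if label is None:
        # Every other religion is passed through unchanged.
        return religion, comment
    rest = (comment or "").strip()
    # Only add "; " when there is an existing comment to separate from.
    new_comment = f"{label}; {rest}" if rest else label
    return "SHIA", new_comment


print(merge_into_shia("7S", "Batiniyya"))
print(merge_into_shia("5S", None))
```

Because the function returns new values instead of modifying anything in place, it can be applied to the exported data only, leaving the production database as it is.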

rpbarczok · 2022-03-10T09:58:52Z

Yes, perfect.

tutebatti · 2022-03-10T10:00:35Z

Just to be sure: This only happens via the export. The data in the production database stays as is, right?

mfranke93 · 2022-03-10T10:05:45Z

> Just to be sure: This only happens via the export. The data in the production database stays as is, right?

That is what I wanted to know here, but for the ensuing discussion, I assumed that this only applies to the data export. To be sure, please confirm that is what you meant as well, @rpbarczok.

I wouldn't do such an invasive edit to the production database without making sure that is actually what you wanted, do not worry 😄

tutebatti · 2022-03-10T10:07:35Z

> I wouldn't do such an invasive edit to the production database without making sure that is actually what you wanted, do not worry

Would have never thought that, but to do bikeshedding as well I wanted to add that to the summary. ;)

rpbarczok · 2022-03-10T11:56:31Z

I confirm.

mfranke93 · 2022-03-10T12:22:07Z

Update: For the current database state, the Shia changes affect ~~28~~ 36 entries. I have listed them below as they would look after the proposed change. If you see any issues with the data (like the comments of 4433 or 1431), you can fix them via GeoDB-Editor quite easily now.

| evidence_id | place_name | religion | time_spans | religion_instance_comment |
| --- | --- | --- | --- | --- |
| 4359 | Multan | Shia | {"[900,1176)"} | Isma'iliya |
| 4291 | Makka | Shia | {"[951,1347)"} | Zayidiya |
| 4433 | Samarra | Shia | {"[848,1401)"} | Imamiya; Imamiyya, since 941 Twelver Schia |
| 1616 | al-Ahsa | Shia | {"[926,1077)"} | Isma'iliya |
| 4616 | al-Ahsa | Shia | {"[900,1075)"} | Isma'iliya |
| 9579 | Afamiya | Shia | {"[1095,1096)"} | Isma'iliya |
| 9813 | al-Rayy | Shia | {"[865,886)"} | Zayidiya |
| 9818 | al-Rayy | Shia | {"[865,886)"} | Zayidiya |
| 1189 | al-Ahwaz | Shia | {"[899,1001)"} | Isma'iliya |
| 1303 | Isfahan | Shia | {"[1100,1126)"} | Isma'iliya |
| 1304 | Isfahan | Shia | {"[1098,1104)"} | Isma'iliya; Batiniyya |
| 1317 | Isfahan | Shia | {"[1072,1195)"} | Isma'iliya |
| 1261 | Bukhara | Shia | {"[1044,1046)"} | Isma'iliya |
| 1446 | Kashan | Shia | {"[1198,1199)"} | Imamiya |
| 1431 | Samarra | Shia | {"[848,1401)"} | Imamiya; Imamiyya since 941 Twelver Shia |
| 1441 | Nahawand | Shia | {"[1300,1401)"} | Imamiya |
| 1470 | Qayin | Shia | {"[1040,1401)"} | Isma'iliya |
| 1617 | al-Ahsa | Shia | {"[913,914)"} | Isma'iliya |
| 1522 | Halab | Shia | {"[1120,1121)"} | Isma'iliya |
| 1521 | Halab | Shia | {"[944,1071)"} | Imamiya |
| 1525 | Halab | Shia | {"[900,1151)"} | Imamiya |
| 1543 | San'a' | Shia | {"[1047,1174)"} | Isma'iliya |
| 1544 | San'a' | Shia | {"[1381,1401)"} | Zayidiya |
| 1546 | San'a' | Shia | {"[1061,1177)"} | Isma'iliya |
| 1547 | San'a' | Shia | {"[1375,1401)"} | Zayidiya |
| 1614 | al-Ahsa | Shia | {"[926,1052)"} | Isma'iliya |
| 1552 | Makka | Shia | {"[951,1347)"} | Zayidiya |
| 1538 | Multan | Shia | {"[900,1176)"} | Isma'iliya |
| 1623 | Suhar | Shia | {"[943,954)"} | Isma'iliya |
| 2617 | Banyas | Shia | {"[1126,1131)"} | Isma'iliya |
| 2889 | al-Hilla | Shia | {"[1262,1277)"} | Imamiya |
| 3271 | al-Dinawar | Shia | {"[985,989)"} | Zayidiya; 5er Shia (Zayidiyya) "[T]hey belong to the school of Sufyan al-Thawri." |
| 3312 | Jabala | Shia | {"[1165,1174)"} | Isma'iliya |
| 3289 | Sa'da | Shia | {"[897,1401)"} | Zayidiya |
| 3550 | al-Ahsa | Shia | {"[985,989)"} | Isma'iliya |
| 3834 | Sa'da | Shia | {"[1200,1201)"} | Zayidiya |

rpbarczok · 2022-03-10T12:29:51Z

We changed the religion_instance_comment anyway, as Florian mentioned here. I am only confused that the German form "Apamea am Orontes" is displayed, I changed that to "Afamiya" some time ago.

mfranke93 · 2022-03-10T12:41:38Z

> We changed the religion_instance_comment anyway, as Florian mentioned #92 (comment). I am only confused that the German form "Apamea am Orontes" is displayed, I changed that to "Afamiya" some time ago.

Yes, my bad. I tested it on the testing database first, and that is where I got that output. I updated it for the production database, where it is 36 entries.

rpbarczok · 2022-03-10T12:47:16Z

OK, I changed the comments of the 4433, 1431 and 3271 in our religion_comment.csv accordingly.

tutebatti · 2022-03-16T09:51:38Z

I think we can close this, right? @mfranke93 @rpbarczok

mfranke93 added the help wanted, question, and paused labels Dec 22, 2021

tutebatti assigned tutebatti and mfranke93 and unassigned tutebatti Jan 10, 2022

mfranke93 removed their assignment Jan 10, 2022

tutebatti assigned rpbarczok Jan 11, 2022

UniStuttgart-VISUS deleted a comment from rpbarczok Jan 19, 2022

mfranke93 added this to To do in Public Instance at HU via automation Jan 20, 2022

mfranke93 moved this from To do to In progress in Public Instance at HU Jan 20, 2022

This was referenced Jan 31, 2022

Explanation of "Aggregation of religious groups" #77

Closed

Change explanations in list "See also" in Place URI page #86

Closed

This was referenced Feb 9, 2022

Keeping data synchronous between public version and productive version #91

Closed

"NULL" vs. empty in column "comment" of Place table #92

Closed

tutebatti removed the paused label Mar 10, 2022

mfranke93 closed this as completed Mar 16, 2022

Public Instance at HU automation moved this from In progress to Done Mar 16, 2022

mfranke93 mentioned this issue Mar 23, 2022

Changes on the automatically generated report #112

Closed
