Example when explaining regular expressions for "Place search" #67

tutebatti · 2022-01-24T12:15:05Z

In the current example in the info text for the search of places, one reads:

The search field supports JavaScript-style regular expressions. For example, to search for locations with an Arabic definite article, the query \ba([tdrzsṣḍṭẓln]|[tds]h)- can be used.

If I understand correctly from the list of places, we do not use the DMG notation for Arabic articles (cf. https://de.wikipedia.org/wiki/DIN_31635). That example makes little sense, then. Any better suggestions? @rpbarczok, you probably know the data itself better than @mfranke93?
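For reference, here is a small sketch of what the quoted expression matches. The regular expression is copied from the info text; the function name `hasArticle` and the sample names other than al-Ahsa are only illustrative and not taken from the data.

```javascript
// Regular expression quoted in the info text: an "a" at a word boundary,
// followed by an assimilated consonant (or "th", "dh", "sh") and a hyphen.
const ARTICLE_ANYWHERE = /\ba([tdrzsṣḍṭẓln]|[tds]h)-/;

// Hypothetical helper: true if the name contains such an article.
function hasArticle(name) {
  return ARTICLE_ANYWHERE.test(name);
}

console.log(hasArticle("al-Ahsa"));  // true
console.log(hasArticle("ash-Sham")); // true
console.log(hasArticle("Baghdad"));  // false
```

If no place name in the data is written with such an article, the expression finds nothing, which is why the example is of little use in the help text.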


mfranke93 · 2022-01-24T12:20:18Z

I think at the time I wrote that example, at least some places did. I don't think any do anymore. The example might be too complex for casual users anyway, but I thought it was neat to demonstrate what it could be used for ;) Feel free to do some simplification here. Maybe this doesn't have to be so detailed, and we can link here for power users that want to do more than normal text search.

mfranke93 · 2022-01-24T12:23:40Z

By the way: The code that sorts the place names alphabetically uses that exact RegEx to this day so that the Arabic definite articles don't affect the sorting, both for the visualization and the reports. This is also why this, and the initial apostrophe, are aligned differently in the location list.
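A rough sketch of that idea, not the actual code: the names `LEADING_ARTICLE`, `sortKey` and `sortPlaces` are made up for illustration, the expression is anchored at the start of the name here (an assumption), and the handling of the initial apostrophe is left out. The place names are sample values only.

```javascript
// Same article pattern as in the info text, anchored at the start of the name
// (assumption for this sketch; the original uses \b instead of ^).
const LEADING_ARTICLE = /^a([tdrzsṣḍṭẓln]|[tds]h)-/;

// Hypothetical sort key: the name without a leading definite article.
function sortKey(name) {
  return name.replace(LEADING_ARTICLE, "");
}

// Sort a copy of the list by the article-free key.
function sortPlaces(names) {
  return [...names].sort((a, b) => sortKey(a).localeCompare(sortKey(b), "en"));
}

console.log(sortPlaces(["ar-Raqqa", "Mosul", "al-Basra", "Baghdad"]));
// [ 'Baghdad', 'al-Basra', 'Mosul', 'ar-Raqqa' ]
```

With this key, "al-Basra" sorts under B and "ar-Raqqa" under R instead of both ending up under A.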

tutebatti · 2022-01-24T12:27:33Z

I could think of something simple like

searching for Bagh?dad would find "Bagdad" as well as "Baghdad", because h followed by ? matches zero or exactly one h.
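As a quick check of that behavior in plain JavaScript (the helper name `matchesQuery` is made up for this sketch, and how the search field compiles the query internally is an assumption):

```javascript
// Hypothetical helper: treat the query as a JavaScript regular expression
// and test it against a single name.
function matchesQuery(query, name) {
  return new RegExp(query).test(name);
}

console.log(matchesQuery("Bagh?dad", "Bagdad"));   // true  (zero "h")
console.log(matchesQuery("Bagh?dad", "Baghdad"));  // true  (one "h")
console.log(matchesQuery("Bagh?dad", "Baghhdad")); // false (two "h")
```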

Linking to the external documentation is good, too.

Btw, why is searching Bagdad finding "Baghdad" already now? As far as I can see, the former is not listed under "alternative names".

mfranke93 · 2022-01-24T12:39:38Z

I could think of something simple like

searching for Bagh?dad would find "Bagdad" as well as "Baghdad", because h followed by ? matches zero or exactly one h.

I like it!

Btw, why is searching Bagdad finding "Baghdad" already now? As far as I can see, the former is not listed under "alternative names".

Because "Bagdad" appears in the simplified column of the alternative names (name_var) table. That is one of the places searched. Why the place search claims "external URI matches" is beyond me though. That is a bug (#68).
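To illustrate, a minimal sketch of a search that looks at several fields and reports where it matched. The record layout (`name`, `alternativeNames`, `simplified`) and the function names are assumptions for this example and do not mirror the real schema or code; the second place is sample data.

```javascript
// Illustrative records: a primary name plus alternative names, each with a
// list of simplified spellings (layout assumed for this sketch).
const places = [
  { name: "Baghdad", alternativeNames: [{ name: "بغداد", simplified: ["Bagdad"] }] },
  { name: "Mosul", alternativeNames: [] },
];

// Return the first field in which the expression matches, or null.
function matchedField(place, re) {
  if (re.test(place.name)) return "name";
  for (const alt of place.alternativeNames) {
    if (re.test(alt.name)) return "alternative name";
    if (alt.simplified.some((s) => re.test(s))) return "simplified";
  }
  return null;
}

// Search all places and keep those with a match in any field.
function searchPlaces(allPlaces, query) {
  const re = new RegExp(query);
  return allPlaces
    .map((p) => ({ place: p.name, matchedIn: matchedField(p, re) }))
    .filter((hit) => hit.matchedIn !== null);
}

console.log(searchPlaces(places, "Bagdad"));
// [ { place: 'Baghdad', matchedIn: 'simplified' } ]
```

Reporting the matched field like this is what the result list should do as well; the wrong "external URI matches" label is a separate problem.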

tutebatti · 2022-01-24T13:01:52Z

I like it!

👍

Because "Bagdad" appears in the simplified column of the alternative names (name_var) table.

But these simplified names are not displayed in the tooltip?

mfranke93 · 2022-01-24T13:10:28Z

No. The transcription is. See also:

tutebatti · 2022-01-24T14:40:20Z

I'm not sure if there's a misunderstanding, but I cannot see any section or something similar entitled transcription.

mfranke93 · 2022-01-24T14:55:41Z

transcription is a column for alternative names. The primary name of a place is always transcribed already, but for alternative names, it could for example be in Arabic script, and then the transcription would provide a "European-readable" version of the name. If you look at the URI page for Baghdad, it is what is written in parentheses in the Arabic name variant (بغداد). This also appears in reports. There is no such section here because it is not an attribute of the place itself.

tutebatti · 2022-01-24T15:00:16Z

In other words, there is a match when searching because the term matches the simplified transcription of an alternative name?

At any rate, I will discuss this with @rpbarczok. I'm not sure how much of this behavior must be made transparent to the visitor who has no access to the db itself, but can only see the tooltip or the URI page which does not provide the simplified transcription either.

mfranke93 · 2022-01-24T17:00:37Z

In other words, there is a match when searching because the term matches the simplified transcription of an alternative name?

Yes. See https://github.tik.uni-stuttgart.de/frankemx/damast/issues/64

tutebatti · 2022-01-25T11:30:28Z

Ok. As @rpbarczok told me, simplified should be mostly consistent in that it represents (i.e., at least one of the strings in simplified represents) a "normalized" form of transcription. It is sufficient to make that transparent to the user.

(It would be preferable, of course, if the transcription was automatically normalized according to given patterns and the results stored in a separate column. Apparently, this is not (easily) implementable. Entering simplified transcriptions manually is prone to errors.)

rpbarczok · 2022-01-25T11:57:31Z

I forgot to mention that we also add an English simplified transcription in the simplified table (e.g. gh, kh, j, sh etc.). Usually we use the simplified English transcription as the main name, but in the case that there is more than one Arabic variant. E.g. in the case of al-Ahsa. For the name variant هجر, we give the transcript Haǧar, and the simplified forms Hagar and Hajar.

tutebatti · 2022-01-26T11:47:55Z

but in the case that there is more than one Arabic variant

@rpbarczok, you mean "but only in the case that"...?

What is more, I'm not sure what to tell the user regarding what you stated.

mfranke93 · 2022-01-26T11:54:32Z

Just my 2 cents: We included this originally to make the search a bit more powerful and also forgiving. So, we wouldn't have to type names exactly (with the ǧ etc.), but could use a Latin g. Since this is quite hard to do only in software (there are a lot of letters with diacritics, hard not to miss some, ...) we decided it would be good to save the typical "latinified" names in the database. In my opinion, this is an implementation detail users do not need to know about at all. The only thing to communicate here would be that the search box is a bit more forgiving regarding exact spelling (or accepts variant spellings of places).
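To show why a software-only approach falls short, here is a sketch of generic diacritic folding via Unicode normalization. This is not what the project does (the simplified names are stored in the database); `foldDiacritics` is a made-up helper name.

```javascript
// Decompose characters (NFD) and drop all combining marks.
function foldDiacritics(text) {
  return text.normalize("NFD").replace(/\p{M}/gu, "");
}

// "Haǧar", written with an escape for ǧ (U+01E7).
const transcription = "Ha\u01E7ar";

console.log(foldDiacritics(transcription));             // Hagar
console.log(foldDiacritics(transcription) === "Hajar"); // false
```

Folding recovers the base letters ("Hagar"), but not an English-style spelling such as "Hajar", and letters that are not base letter plus combining mark are not covered at all.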

rpbarczok · 2022-01-26T12:02:49Z

I am sorry, the sentence was mutilated when editing it. What I mean is: For Arabic and other forms, we usually have one transcribed form in the transcription system of the DMG. Additionally, we save the basic form of the letters in the simplified forms. We later decided also to include the simplified English transcription. So basically you can inform that the user usually should find a place also by entering the basic forms of the letters and by looking for a simplified English transcription, e.g. Hajar.

tutebatti · 2022-01-26T12:15:06Z

The only thing to communicate here would be that the search box is a bit more forgiving regarding exact spelling (or accepts variant spellings of places).

I might not be the average user in that case, but I would want to know how the search works exactly and how I can reproduce results. But I will certainly find an explanation (which you will correct, if necessary) for the current behavior. This is already pretty good:

So basically you can inform that the user usually should find a place also by entering the basic forms of the letters and by looking for a simplified English transcription, e.g. Hajar.

tutebatti added the help wanted and discussion labels Jan 24, 2022

tutebatti assigned mfranke93 and rpbarczok Jan 24, 2022

tutebatti closed this as completed Jan 25, 2022

rpbarczok reopened this Jan 25, 2022

mfranke93 added this to To do in Public Instance at HU via automation Jan 26, 2022

mfranke93 removed their assignment Jan 26, 2022

tutebatti closed this as completed Jan 26, 2022

Public Instance at HU automation moved this from To do to Done Jan 26, 2022
