Arabic / new translations; a couple of fixes; enhancements #1630

Closed
wants to merge 2 commits into from
103 changes: 50 additions & 53 deletions languages/SMW_LanguageAr.php
@@ -19,27 +19,28 @@
 *
 * @author Mahmoud Zouari mahmoudzouari@yahoo.fr http://www.cri.ensmp.fr
 * @author Meno25
+* @author Ahmad Gharbeia أحمد غربية <ahmad@arabdigitalexpression.org>
 * @ingroup SMWLanguage
 * @ingroup Language
 */
 class SMWLanguageAr extends SMWLanguage {

 protected $m_DatatypeLabels = array(
-'_wpg' => 'الصفحة', // name of page datatype
+'_wpg' => 'صفحة', // name of page datatype
 '_txt' => 'نص', // name of the text type
 '_cod' => 'كود', // name of the (source) code type
 '_boo' => 'منطقي', // name of the boolean type
 '_num' => 'عدد', // name for the datatype of numbers
-'_geo' => 'الإحداثيات الجغرافية', // name of the geocoord type
-'_tem' => 'الحرارة', // name of the temperature type
-'_dat' => 'التاريخ', // name of the datetime (calendar) type
-'_ema' => 'البريد الإلكتروني', // name of the email type
-'_uri' => 'مسار', // name of the URL type
-'_anu' => 'التعليق علي معرف الموارد الموحد', // name of the annotation URI type (OWL annotation property)
-'_tel' => 'رقم الهاتف', // name of the telephone (URI) type
-'_rec' => 'تسجيل', // name of record data type
-'_qty' => 'Quantity', // name of the number type with units of measurement //TODO: translate
-'_mlt_rec' => 'Monolingual text',
+'_geo' => 'إحداثيات جغرافية', // name of the geocoord type
+'_tem' => 'درجة حرارة', // name of the temperature type
+'_dat' => 'تاريخ', // name of the datetime (calendar) type
+'_ema' => 'عنوان بريد إلكتروني', // name of the email type
+'_uri' => 'URL', // name of the URL type
+'_anu' => 'URL حاشية', // name of the annotation URI type (OWL annotation property)
+'_tel' => 'رقم هاتفي', // name of the telephone (URI) type
+'_rec' => 'ّسجل', // name of record data type
+'_qty' => 'كميّة', // name of the number type with units of measurement
+'_mlt_rec' => 'نص أحادي اللغة',
 );

 protected $m_DatatypeAliases = array(
@@ -52,62 +53,58 @@ class SMWLanguageAr extends SMWLanguage {

 protected $m_SpecialProperties = array(
 // always start upper-case
-'_TYPE' => 'لديه نوع',
-'_URI' => 'معرف الموارد الموحد معادلة',
-'_SUBP' => 'الخاصية الفرعية ل',
-'_SUBC' => 'تصنيف فرعي من',
-'_UNIT' => 'عرض الوحدات',
-'_IMPO' => 'المستوردة من',
-'_CONV' => 'يقابل',
-'_SERV' => 'يوفر الخدمة',
-'_PVAL' => 'يسمح بالقيمة',
+'_TYPE' => 'من النوع',
+'_URI' => 'URL مكافئ',
+'_SUBP' => 'خصيصة فرعية من',
+'_SUBC' => 'مُصنّف على التصنيف',
+'_UNIT' => 'وحدة العرض',
+'_IMPO' => 'مستوردة من',
+'_CONV' => 'توافق',
+'_SERV' => 'تقدّم خدمة',
+'_PVAL' => 'تقبل القيمة',
 '_MDAT' => 'تاريخ التعديل',
-'_CDAT' => 'Creation date', // TODO: translate
-'_NEWP' => 'Is a new page', // TODO: translate
-'_LEDT' => 'Last editor is', // TODO: translate
-'_ERRP' => 'يمتلك قيمة غير صحيحة ل',
-'_LIST' => 'يمتلك حقول',
-'_SOBJ' => 'Has subobject', // TODO: translate
-'_ASK' => 'Has query', // TODO: translate
-'_ASKST'=> 'Query string', // TODO: translate
-'_ASKFO'=> 'Query format', // TODO: translate
-'_ASKSI'=> 'Query size', // TODO: translate
-'_ASKDE'=> 'Query depth', // TODO: translate
-'_ASKDU'=> 'Query duration', // TODO: translate
-'_MEDIA'=> 'Media type',
-'_MIME' => 'MIME type',
-'_ERRC' => 'Has processing error',
-'_ERRT' => 'Has processing error text',
-'_PREC' => 'Display precision of',
-'_LCODE' => 'Language code',
-'_TEXT' => 'Text',
-'_PDESC' => 'Has property description',
-'_PVAP' => 'Allows pattern',
-'_DTITLE' => 'Display title of',
-'_PVUC' => 'Has uniqueness constraint',
+'_CDAT' => 'تاريخ الإنشاء',
+'_NEWP' => 'هي صفحة جديدة',
+'_LEDT' => 'آخر مَن حرّرها',
+'_ERRP' => 'فيها قيمة غير صحيحة في',//الخصيصة
Member

This does not look ok to me.

Contributor Author

@agharbeia agharbeia Jun 4, 2016

As far as I can see the diff is misaligned for some reason but all of the strings are there.
Or are you concerned with the inline comment causing the BiDi mess? It's just visual. The code is correct as far as I can see.

Member

> are you concerned with the inline comment causing the BiDi mess

Yeah, I am, but on the other hand the automated tests would ring alarm bells if this were breaking PHP or the like. Since this is not the case...

Contributor

This is a bug in how the GitHub viewer handles BiDi text. The patch itself is correct. Regarding the translations, I have two objections:
1. The change in line 108 (protected $m_monthsshort): Gharbeia, you know well that those short names are simply not used in the Arab world (or at least not widely used outside the Arab Eyes people). It doesn't make sense to add short names yourself if most native speakers don't know them. I advise against merging that specific line.
2. I am opposed to using tashkil (تشكيل, Arabic diacritics) in the translations, especially in strings that can't be changed directly through the user interface, as is the case here. Tashkil looks bad on low-resolution screens and it makes the "Find" or "Search" function in the web browser useless. Also, some diacritics are hard to find on an Arabic keyboard.
I made a new patch based on Gharbeia's, but without line 108 and without tashkil.

mypatch.txt
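
Point 2 above asks for the same strings with the tashkil removed. As a rough illustration only: the attached patch is not reproduced here, and `stripTashkil` is a hypothetical helper, not part of Semantic MediaWiki. The sketch assumes the only marks in play are the eight code points U+064B (fathatan) through U+0652 (sukun):

```php
<?php
// Hypothetical helper, not part of Semantic MediaWiki or of the attached patch.
// Deletes the eight core tashkil marks, U+064B (fathatan) through U+0652 (sukun).
function stripTashkil( string $text ): string {
	// The /u modifier makes PCRE treat pattern and subject as UTF-8.
	// preg_replace() returns null on error; fall back to the input in that case.
	return preg_replace( '/[\x{064B}-\x{0652}]/u', '', $text ) ?? $text;
}

// Two labels copied from the diff above; both contain a shadda.
$labels = array(
	'_qty' => 'كميّة',
	'_SERV' => 'تقدّم خدمة',
);
foreach ( $labels as $id => $label ) {
	echo $id, ' => ', stripTashkil( $label ), "\n";
}
```

Letters, spaces, and Latin text such as `URL` pass through unchanged, so a helper like this could be run over every label without special cases.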

Member

@Meno25 Thanks a lot for your elaborate input!

So to sum up as someone who neither speaks nor understands Arabic: apart from the strong concerns mentioned in 1, the translations as such are ok, except for the way they are actually written. I assume that MediaWiki is not using diacritics either, so it makes sense to use your patch. Clarifying 1 should be easy too, since MediaWiki also uses short names. For consistency these should be used here too.

Member

The issue of using diacritics is rather controversial on Arabic Wikipedia.

I do not think that we should try to solve a problem here that has existed on Arabic Wikipedia for 15 years, obviously without conclusion.

Anyway, the Arabic MediaWiki translations file mostly doesn't contain diacritics.

So that's what we should do here, too. Note, to avoid any misunderstanding: I am not judging whether diacritics are better to use or not.

@Meno25 The reason I asked for a second opinion was just to make sure that the translations as such are sensible and understandable in the light of their context. This seems to be the case.

Contributor Author

Thanks for your review and feedback.

Let me first comment on the rationale: You probably know, Meno, that many norms and widely accepted conventions of localisation have been changed and rethought during the past ten years of FOSS localisation, including the introduction of idioms that hadn't existed before, and bringing others under critique. Whether those changes are for the better or for the worse. This will remain controversial. I personally think that much remains in need of rethinking. I also know that the Arabic Wikipedia community has become increasingly more conservative; not only in regard to localisation.

Regarding the short months names case: I know they are used in Gnome, a major FOSS package, as well as the Mozilla suite.

Language - and modes thereof - is just another technology of communication. It has phases. ICT has introduced modes and idioms that hadn't existed before, and I don't see why we should insist on sticking to what has been known before in this specific point, i.e. month names. For me if it is useful, it should be allowed; people should be given the chance to try them. After all, in all applications I know of, date formats are selectable. No imposing of anything is taking place here.

As for tashkeel: You probably know that it is an essential part of the Arabic writing system, without which it is not considered complete. The absence of tashkeel from computing has been traditionally - in my opinion - due to the difficulties of presentation; i.e. fonts, resolutions, encoding. This is all mostly of the past now. What remains of these difficulties should be tackled technically, if possible, rather than by imposing restrictions on how the language is used. The language is changing and will continue to change because of the effects of technology. I just don't see this change and its rationale as valid or necessary. An example is the search function in browsers, which should implement switches to ignore tashkeel, as popular search engines do. In turn, search engines SHOULD implement switches to not ignore tashkeel, because this can make a hell of a lot of difference in search results in some cases, such as classical texts. The Arabic keyboard is evil, but that's another discussion.
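
A search switch of the kind described here could look roughly like the following PHP sketch. Everything in it is an assumption made for illustration: `findArabic` and its `$ignoreTashkeel` flag are hypothetical names, no browser or search engine is claimed to work this way, and only the marks U+064B through U+0652 are treated as tashkeel. In tolerant mode the needle is turned into a pattern that accepts any run of marks after each letter, so the stored text does not have to be altered:

```php
<?php
// Hypothetical helper; the name and the $ignoreTashkeel switch are illustrative only.
function findArabic( string $haystack, string $needle, bool $ignoreTashkeel = true ): bool {
	if ( $needle === '' ) {
		return true; // nothing to look for
	}
	if ( !$ignoreTashkeel ) {
		// Strict mode: marks must match exactly, e.g. for classical texts.
		return strpos( $haystack, $needle ) !== false;
	}
	$marks = '[\x{064B}-\x{0652}]';
	// Drop marks from the needle, then split it into single UTF-8 characters.
	$bare = preg_replace( "/$marks/u", '', $needle ) ?? $needle;
	$letters = preg_split( '//u', $bare, -1, PREG_SPLIT_NO_EMPTY ) ?: array();
	// Allow any run of marks after each letter of the needle.
	$pattern = '';
	foreach ( $letters as $letter ) {
		$pattern .= preg_quote( $letter, '/' ) . $marks . '*';
	}
	return preg_match( '/' . $pattern . '/u', $haystack ) === 1;
}

$label = 'تقدّم خدمة'; // label with a shadda, copied from the diff
var_dump( findArabic( $label, 'تقدم' ) );        // tolerant mode
var_dump( findArabic( $label, 'تقدم', false ) ); // strict mode
```

With the flag on, an unvocalised query finds the vocalised label; with it off, the same query misses it, which is the distinction both kinds of reader are asking for.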

Most importantly, the absence of tashkeel causes ambiguity and/or prevents the use of concise expressions that are arguably more suitable in certain cases, for fear of ambiguity.

I personally think that since SMW is not yet an integral part of Wikipedia, and that SMW is almost not used in Arabic mediawikis at all, it should not be strictly governed by WP-ar conventions. At least not without a space for reconsidering solutions.

In all cases, for the sake of the projects I'm working on, I will have to maintain a different translation for this and some other modules, as we do for our base MW installations, and hope that new insights in the discussions will bring them closer in the future. I doubt this will be possible, though, if the rationale is basically "we need to do it this way because this is how it has been done so far in those other places", because of the exponential difficulty of effecting change in the body of localisation texts, while in fact fixing a small and rarely used module such as SMW should be independently possible; but alas.

Thank you all, and regards.

Contributor

> Whether those changes are for the better or for the worse. This will remain controversial.

I agree here.

> I also know that the Arabic Wikipedia community has become increasingly more conservative; not only in regard to localisation.

Maybe you are right. However, Arabic Wikipedia is the largest Arabic language wiki on the Internet. So, the opinion of the Arabic Wikipedia community should be taken into consideration here. After all, Semantic MediaWiki is a MediaWiki extension and it was we (Arabic Wikipedia users) who started the Arabic MediaWiki translation in the first place.

> I know they are used in Gnome, a major FOSS package, as well as the Mozilla suite.

Both were translated by Arab Eyes people, if I am not mistaken.

> The Arabic keyboard is evil, but that's another discussion.

I agree.

> For me if it is useful, it should be allowed; people should be given the chance to try them.

I respect your opinion but we disagree here. We simply don't "invent" language. In Wikipedia, we use only words and phrases commonly used in Arabic sources/literature.

> that it is an essential part of the Arabic writing system

Yes, I know. Both of us have more than 10 years of experience with Wikipedia. However, you only make a few edits every month while I make hundreds of edits every week. That's why we disagree here. The issue of diacritics is very annoying when you are trying to find a certain word in a 150-kilobyte article/webpage and you don't have time to read the whole article/webpage. I agree that web browser and search engine code should be modified as you described. However, until this happens, let's not make things harder for very active users (who make most edits in any wiki, as you know).

Another note: You know well that most Arabic websites don't use diacritics for the reasons you and I stated here (due to the difficulties of presentation; i.e. fonts, resolutions, encoding).

> This is all mostly of the past now.

Not in my opinion.

> prevents the use of concise expressions that are arguably more suitable in certain cases, for fear of ambiguity.

True. However, the problem of using Arabic diacritics has no ideal solution. We are just weighing pros and cons.

Bottom line: I have stated my opinion here and I have nothing further to add. I have already submitted a patch and the decision of whether to merge your patch or mine is up to kghbln. I was happy to have this conversation with you and thank you.

Contributor Author

> > I also know that the Arabic Wikipedia community has become
> > increasingly more conservative; not only in regard to localisation.
>
> Maybe you are right. However, Arabic Wikipedia is the largest Arabic
> language wiki on the Internet. So, the opinion of the Arabic Wikipedia
> community should be taken into consideration here. After all, Semantic
> MediaWiki is a MediaWiki extension and it was we (Arabic Wikipedia
> users) who started the Arabic MediaWiki translation in the first place.

In my opinion it doesn't matter who started the translation in the
first place. Many first translations of software packages are worthless.
What I mean here is that being the first instance should not in itself be a
determinant.

> > I know they are used in Gnome, a major FOSS package, as well as the
> > Mozilla suite.
>
> Both were translated by Arab Eyes people if I was not mistaken.

If this is so, does it automatically mean they are not orthodox?
I hope you're not judging the work just by the identity of the
maker. If the point you're putting forward is existing usability, then
existence in such popular software packages could mean these proposed
short forms are probably used by many, regardless of how they got there.
Just like new words that pop up every day in all spoken languages. I
don't claim to have numbers, though.

> > For me if it is useful, it should be allowed; people should be given
> > the chance to try them.
>
> I respect your opinion but we disagree here. We simply don't "invent"
> language. In Wikipedia, we use only words and phrases commonly used in
> Arabic sources/literature.

I think you're carrying the "no original work" rule from Wikipedia
beyond its applicability here. All localisation of software and
technology entails original creation. Technology has already brought new
modes of language into usage. The Wikipedia jargon is itself an example
of that.

> > that it is an essential part of the Arabic writing system
>
> Yes, I know. Both of us have more than 10 years experience with
> Wikipedia. However, you only make a few edits every month while I do
> make hundreds of edits every week. That's why we disagree here. The
> issue of diacritics is very annoying when you are trying to find a certain
> word in 150 kilobytes article/webpage and you don't have time to read
> all the article/webpage. I agree that web browsers and search engines
> code should be modified as you described. However until this happens,
> let's not make things harder for very active users (who make most edits
> in any wiki as you know).

I think it's a chicken-and-egg matter. If we put rules in place to
inhibit use because of limitations of technology, then technology is less
likely to solve those limitations.

Moreover, this difficulty could be valid in the body text of articles. But
not in the UI, right?

Regards,
Ahmad

Member

Thanks for all your input. However that's exactly the kind of discussion I wanted to avoid since I believe that it should be done on the MediaWiki level. I had hoped that my #1630 (comment) was unambiguous in this respect.

Here we should stick to what MediaWiki is doing and not what Gnome, KDE or Cinnamon are doing. This may very well be disappointing from one point of view or the other, but I believe it is preferable to have consistency within the MediaWiki environment.

You can have issues in a lot of languages, e.g. in German translations MediaWiki writes "du" instead of "Du" as salutation, which I feel is impolite or even rude. Both versions are formally possible and "du" was chosen by the community. So I stick to it for the sake of consistency, which I very much prefer.

+'_LIST' => 'يتأّلف من الحقول',//السجِّل
+'_SOBJ' => 'تحوي الكائن الفرعي', //الصفحة
+'_ASK' => 'فيها الاستعلام',//الصفحة
+'_ASKST'=> 'نص الاستعلام',
+'_ASKFO'=> 'صيغة الاستعلام',
+'_ASKSI'=> 'حجم الاستعلام',
+'_ASKDE'=> 'عمق الاستعلام',
+'_ASKDU'=> 'مدّة الاستعلام',
+'_MEDIA'=> 'نوع الميديا',
+'_MIME' => 'نوع MIME',
+'_ERRC' => 'بها خطأ في المعالجة',
+'_ERRT' => 'لها وصف الخطأ في المعالجة',
+'_PREC' => 'تعرض الدّقة إلى',
+'_LCODE' => 'رمز اللغة',
+'_TEXT' => 'نص',
+'_PDESC' => 'لها وصف الخصيصة',
+'_PVAP' => 'تقبل النمط',
+'_DTITLE' => 'تعرض العنوان',
+'_PVUC' => 'مقيّدة بالتفرّد',
 );

 protected $m_SpecialPropertyAliases = array(
-'عرض الوحدة' => '_UNIT'
+'وحدة العرض' => '_UNIT'
 );

 protected $m_Namespaces = array(
-SMW_NS_PROPERTY => 'خاصية',
-SMW_NS_PROPERTY_TALK => 'نقاش_الخاصية',
+SMW_NS_PROPERTY => 'خصيصة',
+SMW_NS_PROPERTY_TALK => 'نقاش_الخصيصة',
 SMW_NS_TYPE => 'نوع',
 SMW_NS_TYPE_TALK => 'نقاش_النوع',
-SMW_NS_CONCEPT => 'مبدأ',
-SMW_NS_CONCEPT_TALK => 'نقاش_المبدأ'
+SMW_NS_CONCEPT => 'مفهوم',
+SMW_NS_CONCEPT_TALK => 'نقاش_المفهوم'
 );

 protected $m_dateformats = array( array( SMW_Y ), array( SMW_MY, SMW_YM ), array( SMW_MDY, SMW_DMY, SMW_YMD, SMW_YDM ) );

 protected $m_months = array( "يناير", "فبراير", "مارس", "أبريل", "مايو", "يونيو", "يوليو", "أغسطس", "سبتمبر", "أكتوبر", "نوفمبر", "ديسمبر" );

-protected $m_monthsshort = array( "يناير", "فبراير", "مارس", "أبريل", "مايو", "يونيو", "يوليو", "أغسطس", "سبتمبر", "أكتوبر", "نوفمبر", "ديسمبر" );
+protected $m_monthsshort = array( "ينا", "فبر", "مارس", "أبريل", "ماي", "يوني", "يولي", "غسط", "سبت", "أكت", "نوف", "ديس" );

 }