# Localizing XYZ French Translations

The French XYZ site translation has a higher conversion rate for users from France (~8%) compared to those in French-speaking countries outside France (~5%). Since the only translation was from a French speaker from France, this could be an issue of localization. French dialects differ from country to country, so it is possible that French speakers not native to France are not connecting with the French translation.

To test this hypothesis, localizations were provided for a number of French-speaking locales around the world. Each country in the test had the site translated by a native of that country. The test was run from November 30 to December 4, 2015. Users in that time were split randomly into a test and control group. The test group were shown these new localized translations, while those in the control group were shown the old translation by the native of France. 

The preliminary results of this test are surprising: the control group has a higher conversion rate. At face value, this would imply that the localized translations are having the opposite of the anticipated effect. Further analysis shows this not to be the case, however. The discrepancy results from both lower overall conversion rates in Algeria and Mauritius and a higher proportion of users in the test group in those countries. After excising Algeria and Mauritius, the conversion rate is the same for the control and test groups, which suggests the localized translations have no measurable effect.

## Analysis
The discrepancy between the conversion rates of the control and test groups comes from differences between the user populations of different countries. The Conversion Rate (fraction of users who converted) and Test Fraction (fraction of users in the test group) for each country are summarized below:

In [1]:
from SambaTranslation import *
T = TranslationTest()




In [2]:
# Per-country share of users in the test group and overall conversion rate.
# The group mean of each column is the same as sum / count.
by_country = T.df.groupby('country')
df_table = pd.DataFrame({'Test Fraction': by_country['test'].mean(),
                         'Conversion Rate': by_country['conversion'].mean()})
df_table.sort_values('Conversion Rate')

| country | Conversion Rate | Test Fraction |
|---|---|---|
| Mauritius | 0.012821 | 0.899613 |
| Algeria | 0.013994 | 0.799799 |
| Seychelles | 0.048089 | 0.502404 |
| Haiti | 0.048634 | 0.501079 |
| Andorra | 0.048863 | 0.503199 |
| Belgium | 0.049072 | 0.494432 |
| Senegal | 0.049253 | 0.491013 |
| Tunisia | 0.049653 | 0.496066 |
| Switzerland | 0.049666 | 0.496194 |
| Morocco | 0.049704 | 0.500785 |


The majority of countries have a conversion rate of approximately 5%. Excluding France, Algeria, and Mauritius, the conversion rates range from 4.81% to 5.35%. The conversion rates in Algeria and Mauritius are much lower, at 1.4% and 1.3% respectively. While users in the other countries were split evenly between the test and control groups, about 80% of users in Algeria and 90% of users in Mauritius were placed in the test group.

The combination of low conversion rates and skewed test fractions in Algeria and Mauritius creates an imbalance between the test and control groups. Users from these two countries drive down the average conversion rate, but they drive it down more in the test group. Based on the test fractions in the table, the test group contains roughly 4 times as many Algerian users and 9 times as many Mauritian users as the control group, whereas users from other countries are evenly split between the two.
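The pooling effect described here can be checked with simple arithmetic. The sketch below is illustrative only: the user counts are invented, and each country is assumed to convert at the same rate in both groups (rates and test fractions are rounded from the table). Even under that assumption, the pooled test rate comes out well below the pooled control rate.

```python
# Hypothetical illustration: every country converts at the same rate in test
# and control, yet the pooled test rate is lower because low-converting
# countries are over-represented in the test group.

# country: (users, conversion rate, test fraction)
# User counts are invented; rates and fractions are rounded from the table.
countries = {
    "Algeria":   (10_000, 0.014, 0.80),
    "Mauritius": (10_000, 0.013, 0.90),
    "Morocco":   (10_000, 0.050, 0.50),
    "Belgium":   (10_000, 0.049, 0.50),
}

def pooled_rate(group):
    """Pooled conversion rate for 'test' or 'control' across all countries."""
    users = 0.0
    converted = 0.0
    for n, rate, test_frac in countries.values():
        share = test_frac if group == "test" else 1.0 - test_frac
        users += n * share
        converted += n * share * rate
    return converted / users

test_rate = pooled_rate("test")
control_rate = pooled_rate("control")
print(f"test:    {test_rate:.2%}")
print(f"control: {control_rate:.2%}")
```

No per-country difference between test and control is needed to produce the gap; the skewed allocation alone is enough.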

This imbalance is not the only issue affecting the conversion rate. When the sample is split by country, some countries, such as Morocco, actually have a higher conversion rate in the test group. This appears to be largely a function of the browser language, however:

In [3]:
T.naive_test(df = T.df[T.df.country == 'Morocco'], iterate_on = 'browser_language')


browser_language = EN:
   Test Conversion Rate: 6.12% +/- 0.64%
Control Conversion Rate: 4.65% +/- 0.58%

browser_language = FR:
   Test Conversion Rate: 4.97% +/- 0.24%
Control Conversion Rate: 4.83% +/- 0.24%

browser_language = Other:
   Test Conversion Rate: 4.79% +/- 1.25%
Control Conversion Rate: 5.02% +/- 1.31%

Full Test:
   Test Conversion Rate: 5.13% +/- 0.22%
Control Conversion Rate: 4.81% +/- 0.22%


The results for Morocco seem to be influenced by users whose browser is set to English, since users with a French browser show no significant difference in conversion rate between the test and control groups. If the browser is in English, users are probably not viewing the localized French translations, so these users may be skewing the results.
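One rough way to compare these gaps is to divide each difference by its combined uncertainty. The sketch below copies the Morocco figures from the output above and assumes that each "+/-" value is one standard error and that the two groups are independent; neither assumption is confirmed by the source, and `z_score` is a hypothetical helper, not part of `SambaTranslation`.

```python
import math

# Rates and uncertainties in percent for Morocco, copied from the output above.
# Assumption: each "+/-" value is one standard error.
results = {
    "EN":    {"test": (6.12, 0.64), "control": (4.65, 0.58)},
    "FR":    {"test": (4.97, 0.24), "control": (4.83, 0.24)},
    "Other": {"test": (4.79, 1.25), "control": (5.02, 1.31)},
}

def z_score(test, control):
    """Difference in rates divided by the combined standard error."""
    (rate_t, se_t), (rate_c, se_c) = test, control
    return (rate_t - rate_c) / math.hypot(se_t, se_c)

z = {lang: z_score(r["test"], r["control"]) for lang, r in results.items()}
for lang, value in z.items():
    print(f"{lang}: z = {value:+.2f}")
```

Under these assumptions the English-browser gap is the largest of the three, at about 1.7 combined standard errors, while the French-browser gap is well under one.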

After excising users from Algeria and Mauritius and only including entries where the browser language is French, there is no difference in conversion rate between the test and control groups:

In [4]:
T.df_fr = T.df[T.df.browser_language == 'FR']
T.naive_test(df = T.df_fr[np.in1d(T.df_fr.country,['Algeria','Mauritius'], 
                                  invert=True)], excludeFrance=True)

   Test Conversion Rate: 5.05% +/- 0.06%
Control Conversion Rate: 5.05% +/- 0.06%


## Fixing the Problem
Nominally, the problem is in the distribution of users into test and control groups. In the existing case, when the sample was subdivided by country, some subdivisions had overall conversion rates significantly different from the rest of the sample. This would be acceptable if the fraction of users in the test group were the same across all subdivisions, but that was not the case either.

To identify whether this problem is occurring in the future, one can first locate any subdivisions that are outliers in overall conversion rate. If there are outliers, one can then perform a $\chi^2$ test of whether the fraction of users in the test group is the same among all subdivisions. If it is not, then the same problem exists.

To fix the problem, subdivisions with outliers in overall conversion rate can be excised, as was done above. Alternatively, all subdivisions can be independently resampled to have the same fraction of users in the test group. 
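The resampling option could be sketched as follows. This is a minimal illustration, not the method used in `SambaTranslation`: `rebalance` is a hypothetical helper, the `test` column is assumed to hold 0/1 flags, and the demo data is invented. It downsamples whichever group is over-represented in each country until the test fraction is close to the target.

```python
import pandas as pd

def rebalance(df, target=0.5, group_col="country", test_col="test", seed=0):
    """Downsample each subdivision so its test fraction is close to `target`."""
    parts = []
    for _, sub in df.groupby(group_col):
        test = sub[sub[test_col] == 1]
        control = sub[sub[test_col] == 0]
        # Largest test and control counts that respect the target ratio.
        n_test = min(len(test), int(len(control) * target / (1 - target)))
        n_control = min(len(control), int(n_test * (1 - target) / target))
        parts.append(test.sample(n=n_test, random_state=seed))
        parts.append(control.sample(n=n_control, random_state=seed))
    return pd.concat(parts)

# Invented demo data: Algeria is split 80/20, Morocco 50/50.
demo = pd.DataFrame({
    "country": ["Algeria"] * 10 + ["Morocco"] * 10,
    "test": [1] * 8 + [0] * 2 + [1] * 5 + [0] * 5,
})
balanced = rebalance(demo)
fractions = balanced.groupby("country")["test"].mean()
print(fractions)
```

Downsampling discards data, so excising a small outlier subdivision may be preferable when the rest of the sample is already balanced.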

In [7]:
T.is_there_an_outlier(subdivide_col = 'country', excludeFrance = True, outlier_thresh = 3)

True

In [6]:
import scipy.stats
scipy.stats.chisquare(df_table['Conversion Rate'].median(),
                      df_table['Conversion Rate'].values)

Power_divergenceResult(statistic=0.20887622618688559, pvalue=0.99999999999968003)
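The `scipy.stats.chisquare` call above operates on conversion rates rather than on counts of users per group, so its p-value does not directly address whether the test fraction is the same in every country. One way to express that check is a $\chi^2$ test of homogeneity on a country-by-group table of user counts, using `scipy.stats.chi2_contingency`. The sketch below uses invented counts for illustration only.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Invented user counts per country and group, for illustration only.
counts = pd.DataFrame(
    {"test": [8000, 9000, 5000, 4950], "control": [2000, 1000, 5000, 5050]},
    index=["Algeria", "Mauritius", "Morocco", "Belgium"],
)

# Null hypothesis: the test/control split is the same in every country.
chi2, p_value, dof, expected = chi2_contingency(counts)
print(f"all countries: chi2 = {chi2:.1f}, dof = {dof}, p = {p_value:.3g}")

# The same check restricted to the evenly split countries.
even = counts.loc[["Morocco", "Belgium"]]
chi2_even, p_even, dof_even, _ = chi2_contingency(even)
print(f"even split:    chi2 = {chi2_even:.2f}, dof = {dof_even}, p = {p_even:.3g}")
```

With real data, the table of counts could be built with `pd.crosstab(T.df.country, T.df.test)`, assuming `test` is a 0/1 flag. A small p-value on the full table indicates that the allocation differs between countries.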