# Evaluating Piotroski F-score on ADRs: Alpha Generation versus Classification Accuracy

#### <b><span style="color:red">TRANSPARENCY AND REPRODUCIBILITY NOTICE:</span></b>
<b><span style="color:red">Because live data can be inconsistent (unsuccessful fetches, delistings, and the availability and accuracy limitations noted in the Yahoo Terms of Service [3]), data pre-downloaded on January 18, 2024 are saved and used in Section 5 to reproduce the results </span><span style="text-decoration:underline;color:red">for consistency throughout the research</span><span style="color:red">. The Python scripts in Sections 2 through 4 all use </span><span style="text-decoration:underline;color:red">live data for transparency on logic</span><span style="color:red">. Results from live data may differ from those based on the January 18, 2024 data, but there is no significant difference in the overall results of the research.</span></b>

## 1. Identify the list of foreign ADRs and extract it for use

The stock universe used in this research is from **adr.com**, a website owned by J.P. Morgan that houses information on depositary receipts, including ADR stocks [1]. From this data, the list of ADR stocks and their ticker names are identified. The list of ADRs extracted from **adr.com** is saved at: https://github.com/rexlaboratory/adr-piotroski-f-score/tree/main/adr-universe.

## 2. Retrieve financial data using YahooQuery

The financial information of the ADR stocks is retrieved from Yahoo Finance, primarily using the Yahoo Query Python package.

In [1]:
# Load Libraries

import pandas as pd
import numpy as np
import os
import yahooquery as yq
from yahooquery import Ticker
import concurrent.futures
from concurrent.futures import ThreadPoolExecutor
from tqdm import tqdm
import yfinance as yf
from datetime import datetime, timedelta
import matplotlib.pyplot as plt
import concurrent
import logging
import pandas_datareader
from pandas_datareader import data as pdr
from IPython.display import display
import jinja2
import warnings
from sklearn.metrics import matthews_corrcoef
warnings.filterwarnings("ignore") # Ignore warnings

In [2]:
# Specify the file containing the ADR list
file_path = './adr-universe/dr_universe.xlsx'

# Read data from the Excel file into a DataFrame
df_adr = pd.read_excel(file_path)

In [3]:
# Check how many records are there in df_adr
len(df_adr)

2498

In [4]:
# Filter to include ADRs only
df_adr = df_adr[df_adr['Type'] == 'ADR']

In [5]:
# Check how many records remain after filtering
len(df_adr)

2177

In [6]:
# Store ADR ticker names in a list
tickers_list = df_adr['Symbol'].tolist()

In [7]:
# Columns to retrieve from Yahoo Finance via Yahoo Query
columns_to_extract = [
    'asOfDate', 'periodType', 'NetIncome', 'GrossProfit', 'PretaxIncome',
    'TotalRevenue', 'LongTermDebt', 'LongTermDebtAndCapitalLeaseObligation',
    'TotalAssets', 'CurrentAssets', 'CurrentLiabilities', 'OperatingCashFlow',
    'DilutedEPS', 'ShareIssued'
]

# List of tickers
tickers = tickers_list

# Define the function to fetch the data
def fetch_data(ticker):
    try:
        # Get all available financial statement data for the current ticker
        financial_data = Ticker(ticker).all_financial_data()

        # Extract the specified columns (with null values for non-existing columns)
        data_for_ticker = {column: financial_data.get(column) for column in columns_to_extract}

        # Add 'ticker' as a key to the dictionary
        data_for_ticker['ticker'] = ticker

        # Convert the dictionary to a DataFrame and return it
        return pd.DataFrame(data_for_ticker)
    except Exception as e:
        # print(f"Error fetching data for {ticker}: {str(e)}")

        return pd.DataFrame()

In [8]:
# Note: Running this code may take a while to complete. ThreadPool helped gain a slight improvement in run time.

# Parallel processing to fetch the financial data
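with ThreadPoolExecutor() as executor, tqdm(total=len(tickers), desc="Fetching Data") as pbar:
    # Fetch each ticker, then advance the progress bar, so the bar's return value never replaces the fetched data
    def fetch_and_track(ticker):
        result = fetch_data(ticker)
        pbar.update(1)
        return result

    # Fetch data for all tickers concurrently
    df_list = list(executor.map(fetch_and_track, tickers))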

Fetching Data: 100%|███████████████████████████████████████████████████████████████| 2177/2177 [13:31<00:00,  2.68it/s]


In [9]:
# Filter out unsuccessful fetches (empty DataFrames or non-DataFrame values)
df_list1 = [df for df in df_list if isinstance(df, pd.DataFrame) and not df.empty]

In [10]:
# Combine individual DataFrames into one DataFrame
df1 = pd.concat(df_list1, ignore_index=True)

In [11]:
# Check the first few records
df1.head()

Unnamed: 0,asOfDate,periodType,NetIncome,GrossProfit,PretaxIncome,TotalRevenue,LongTermDebt,LongTermDebtAndCapitalLeaseObligation,TotalAssets,CurrentAssets,CurrentLiabilities,OperatingCashFlow,DilutedEPS,ShareIssued,ticker
0,2020-12-31,12M,-2709347000.0,1076011000.0,-2581792000.0,4829019000.0,3901053000.0,5234680000.0,19373760000.0,6055607000.0,6121960000.0,714243000.0,-26.82,817170095.0,VNET
1,2021-12-31,12M,500098000.0,1438030000.0,665174000.0,6189801000.0,6481966000.0,9885772000.0,23095040000.0,5324123000.0,5179995000.0,1387922000.0,-2.16,890714046.0,VNET
2,2022-12-31,12M,-775952000.0,1358256000.0,-630455000.0,7065232000.0,8909115000.0,12862040000.0,26948400000.0,7052276000.0,6332085000.0,2440214000.0,-5.22,921495769.0,VNET
3,2020-03-31,12M,214000000.0,,215000000.0,331000000.0,,595000000.0,8567000000.0,,,,0.1105,983472617.0,TGOPY
4,2021-03-31,12M,1855000000.0,,1855000000.0,1966000000.0,,992000000.0,10837000000.0,,,,0.9595,981697581.0,TGOPY


In [12]:
# Check the number of unique tickers fetched
len(df1['ticker'].unique())

998

## 3. Transform data and compute Piotroski F-Scores

In [13]:
# For historical reference, write the cleansed ADR data to an Excel file
df1.to_excel('./output-files/cleansed_adr_list.xlsx', index=False)

In [14]:
# Convert 'asOfDate' to datetime type
df1['asOfDate'] = pd.to_datetime(df1['asOfDate'])

# Create an independent copy of df1 so that later in-place changes do not alter df1
df1_copy = df1.copy()

In [15]:
# Remove duplicate records using both 'asOfDate' and 'ticker'
df1_copy.drop_duplicates(subset=['asOfDate', 'ticker'], keep='first', inplace=True)

# Filter only the tickers with 'asOfDate' values: '2020-12-31', '2021-12-31', '2022-12-31'
target_asofdates = ['2020-12-31', '2021-12-31', '2022-12-31']
df_filtered1 = df1_copy[df1_copy['asOfDate'].isin(target_asofdates)]

# Keep only tickers that have records for ALL three 'asOfDate' values: 2020, 2021, 2022
filtered_tickers = df_filtered1.groupby('ticker')['asOfDate'].transform('nunique') == 3
df_filtered2 = df_filtered1[filtered_tickers]
# Sort so that shift() compares consecutive years within each ticker; copy so new columns can be added safely
df_filtered3 = df_filtered2.sort_values(['ticker', 'asOfDate']).copy()

### 3.1 Compute for the Piotroski F-Scores

From the financial information, the following nine questions are answered using binary logic [2]. For each ADR stock in a given year, one (1) point is earned if the answer to the corresponding question is ‘yes’. Otherwise, zero (0) points are given. The F-score is the sum of the nine points, so it ranges from 0 to 9.

1. Is Net Income positive?
2. Is Operating Cash Flow positive?
3. Is Return on Assets (Net Income divided by Total Assets) higher this year compared to last year?
4. Is Operating Cash Flow greater than Net Income?
5. Is Leverage (Long Term Debt And Capital Lease Obligation divided by Total Assets) lower this year compared to last year? Long Term Debt is used as a proxy in case Long Term Debt And Capital Lease Obligation is not available from Yahoo Finance data.
6. Is Liquidity (Current Assets divided by Current Liabilities) higher this year compared to last year?
7. Is Shares Issued lower this year compared to last year?
8. Is Gross Margin (Gross Profit divided by Total Revenue) higher this year compared to last year? Pretax Income is used as a proxy in case Gross Profit is not available from Yahoo Finance data.
9. Is Asset Turnover (Total Revenue divided by Total Assets) higher this year compared to last year?
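
As a minimal sketch of how the nine binary signals combine, the snippet below scores one hypothetical company from two years of made-up figures. The values, the `ratio` helper, and the `fscore_signals` function are illustrative assumptions and not part of this research's pipeline; leverage here uses Long Term Debt only.

```python
# Hypothetical two-year financials for one company (illustrative values only)
prev = {"NetIncome": 80, "OperatingCashFlow": 90, "TotalAssets": 1000,
        "LongTermDebt": 300, "CurrentAssets": 400, "CurrentLiabilities": 250,
        "ShareIssued": 100, "GrossProfit": 350, "TotalRevenue": 900}
curr = {"NetIncome": 100, "OperatingCashFlow": 130, "TotalAssets": 1050,
        "LongTermDebt": 280, "CurrentAssets": 450, "CurrentLiabilities": 260,
        "ShareIssued": 100, "GrossProfit": 400, "TotalRevenue": 1000}


def ratio(year, numerator, denominator):
    """Divide one line item by another for a single year."""
    return year[numerator] / year[denominator]


def fscore_signals(curr, prev):
    """Return the nine 0/1 signals in the same order as the questions above."""
    signals = [
        curr["NetIncome"] > 0,
        curr["OperatingCashFlow"] > 0,
        ratio(curr, "NetIncome", "TotalAssets") > ratio(prev, "NetIncome", "TotalAssets"),
        curr["OperatingCashFlow"] > curr["NetIncome"],
        ratio(curr, "LongTermDebt", "TotalAssets") < ratio(prev, "LongTermDebt", "TotalAssets"),
        ratio(curr, "CurrentAssets", "CurrentLiabilities") > ratio(prev, "CurrentAssets", "CurrentLiabilities"),
        curr["ShareIssued"] < prev["ShareIssued"],
        ratio(curr, "GrossProfit", "TotalRevenue") > ratio(prev, "GrossProfit", "TotalRevenue"),
        ratio(curr, "TotalRevenue", "TotalAssets") > ratio(prev, "TotalRevenue", "TotalAssets"),
    ]
    return [int(s) for s in signals]


signals = fscore_signals(curr, prev)
print(signals, "F-score:", sum(signals))
```

In this toy case only the share-issuance question scores zero, because the share count is unchanged rather than lower.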


In [16]:
# Define a function to calculate Boolean values for each of the Piotroski criteria
# Note: IsLeverage2Improved is used as a proxy for IsLeverage1Improved depending on financial statement format from Yahoo Finance
# Note: IsGrossMargin2Improved is used as a proxy for IsGrossMargin1Improved depending on financial statement format from Yahoo Finance
# Note: comparisons involving missing values (including the first year, which has no prior year) evaluate to False and score 0
def calculate_boolean_values(df):
    df['IsNetIncomePositive'] = (df['NetIncome'] > 0).astype(int)
    df['IsOperatingCashFlowPositive'] = (df['OperatingCashFlow'] > 0).astype(int)
    df['IsROAImproved'] = ((df['NetIncome'] / df['TotalAssets']) > (df.groupby('ticker')['NetIncome'].shift() / df.groupby('ticker')['TotalAssets'].shift())).astype(int)
    df['IsCashFlowGreaterThanNetIncome'] = (df['OperatingCashFlow'] > df['NetIncome']).astype(int)
    df['IsLeverage1Improved'] = ((df['LongTermDebtAndCapitalLeaseObligation'] / df['TotalAssets']) < (df.groupby('ticker')['LongTermDebtAndCapitalLeaseObligation'].shift() / df.groupby('ticker')['TotalAssets'].shift())).astype(int)
    df['IsLeverage2Improved'] = ((df['LongTermDebt'] / df['TotalAssets']) < (df.groupby('ticker')['LongTermDebt'].shift() / df.groupby('ticker')['TotalAssets'].shift())).astype(int)
    df['IsLiquidityImproved'] = ((df['CurrentAssets'] / df['CurrentLiabilities']) > (df.groupby('ticker')['CurrentAssets'].shift() / df.groupby('ticker')['CurrentLiabilities'].shift())).astype(int)
    df['IsShareIssuedReduced'] = ((df['ShareIssued']) < (df.groupby('ticker')['ShareIssued'].shift())).astype(int)
    df['IsGrossMargin1Improved'] = ((df['GrossProfit'] / df['TotalRevenue']) > (df.groupby('ticker')['GrossProfit'].shift() / df.groupby('ticker')['TotalRevenue'].shift())).astype(int)
    df['IsGrossMargin2Improved'] = ((df['PretaxIncome'] / df['TotalRevenue']) > (df.groupby('ticker')['PretaxIncome'].shift() / df.groupby('ticker')['TotalRevenue'].shift())).astype(int)
    df['IsAssetTurnoverImproved'] = ((df['TotalRevenue'] / df['TotalAssets']) > (df.groupby('ticker')['TotalRevenue'].shift() / df.groupby('ticker')['TotalAssets'].shift())).astype(int)

    # Add Fscore column
    df['Fscore'] = (
        df['IsNetIncomePositive'] +
        df['IsOperatingCashFlowPositive'] +
        df['IsROAImproved'] +
        df['IsCashFlowGreaterThanNetIncome'] +
        df[['IsLeverage1Improved', 'IsLeverage2Improved']].max(axis=1) + # Get only the maximum between the two; IsLeverage2Improved is used as a proxy for IsLeverage1Improved
        df['IsLiquidityImproved'] +
        df['IsShareIssuedReduced'] +
        df[['IsGrossMargin1Improved', 'IsGrossMargin2Improved']].max(axis=1) + # Get only the maximum between the two; IsGrossMargin2Improved can be used as a proxy for IsGrossMargin1Improved
        df['IsAssetTurnoverImproved']
    )

In [17]:
# Calculate the Piotroski criteria value using the function defined earlier
calculate_boolean_values(df_filtered3)

In [18]:
# Create two new dataframes for 2021 and 2022 summaries
df_2021 = df_filtered3[df_filtered3['asOfDate'].dt.year == 2021].copy()
df_2022 = df_filtered3[df_filtered3['asOfDate'].dt.year == 2022].copy()

In [19]:
# Check the number of unique tickers in 2021 data
len(df_2021['ticker'].unique())

699

In [20]:
# Check the number of unique tickers in 2022 data
len(df_2022['ticker'].unique())

699

In [21]:
# For historical reference, save the Fscore data to Excel files
fscore_file_path = './output-files/adr_piotroskFscores_2021_2022.xlsx'
df_filtered3.to_excel(fscore_file_path, index=False)

# For historical reference, save the 2021 Fscore data to Excel files
fscore_2021_file_path = './output-files/adr_piotroskFscores_2021.xlsx'
df_2021.to_excel(fscore_2021_file_path, index=False)

# For historical reference, save the 2022 Fscore data to Excel files
fscore_2022_file_path = './output-files/adr_piotroskiFscores_2022.xlsx'
df_2022.to_excel(fscore_2022_file_path, index=False)

In [22]:
# Define a variable 'basket_df' as the stock basket
basket_df = df_2021

# Define the start dates and end dates range for downloading the ADR stock price (close price)
start_date_2022 = '2022-01-03'
start_date_2022_1 = '2022-01-04'

end_date_2022 = '2022-12-30'
end_date_2022_1 = '2022-12-31'

start_date_2023 = '2023-01-03'
start_date_2023_1 = '2023-01-04'

end_date_2023 = '2023-12-29'
end_date_2023_1 = '2023-12-30'

In [23]:
# Note: Fetching the daily stock price data for 2-year range for hundreds of stocks may take a while.

# Fetch close prices for the first and last trading day of each year
start_prices_2022 = yf.download(basket_df['ticker'].tolist(), start = start_date_2022, end = start_date_2022_1)['Close']
end_prices_2022 = yf.download(basket_df['ticker'].tolist(), start = end_date_2022, end = end_date_2022_1)['Close']
start_prices_2023 = yf.download(basket_df['ticker'].tolist(), start = start_date_2023, end = start_date_2023_1)['Close']
end_prices_2023 = yf.download(basket_df['ticker'].tolist(), start = end_date_2023, end = end_date_2023_1)['Close']

[*********************100%***********************]  699 of 699 completed

41 Failed downloads:
- SIGCY: Data doesn't exist for startDate = 1641186000, endDate = 1641272400
- CTNGY: Data doesn't exist for startDate = 1641186000, endDate = 1641272400
- BKFKY: Data doesn't exist for startDate = 1641186000, endDate = 1641272400
- ZHHJY: Data doesn't exist for startDate = 1641186000, endDate = 1641272400
- THGHY: Data doesn't exist for startDate = 1641186000, endDate = 1641272400
- GBXXY: Data doesn't exist for startDate = 1641186000, endDate = 1641272400
- HLN: Data doesn't exist for startDate = 1641186000, endDate = 1641272400
- IMIUY: Data doesn't exist for startDate = 1641186000, endDate = 1641272400
- SHANY: Data doesn't exist for startDate = 1641186000, endDate = 1641272400
- TAGYY: Data doesn't exist for startDate = 1641186000, endDate = 1641272400
- ABHBY: Data doesn't exist for startDate = 1641186000, endDate = 1641272400
- TBABY: Data doesn't exist for startDate = 1641186000, endD

Some of the stocks (e.g., delisted stocks in a particular year) do not have price data. For example, in 2022 there are 50 stocks (out of 742) without price data, while we have complete price data for 692 stocks.
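
One way to count tickers without price data is to look for missing values in the single price row that each download returns. The sketch below uses a small made-up table shaped like that result; the tickers and prices are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical one-row price table: one column per ticker, NaN where no price exists
prices = pd.DataFrame({"AAA": [10.0], "BBB": [np.nan], "CCC": [7.5]})

# Tickers whose price on that day is missing
missing = prices.columns[prices.iloc[0].isna()].tolist()

# Tickers with a usable price
available = prices.shape[1] - len(missing)

print(f"{len(missing)} of {prices.shape[1]} tickers have no price; {available} are usable")
```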

In [24]:
# For historical reference, save the successfully retrieved ADR stock price data to Excel files

start_prices_2022.to_excel('./output-files/start_prices_2022.xlsx', index=False)
end_prices_2022.to_excel('./output-files/end_prices_2022.xlsx', index=False)
start_prices_2023.to_excel('./output-files/start_prices_2023.xlsx', index=False)
end_prices_2023.to_excel('./output-files/end_prices_2023.xlsx', index=False)

In [25]:
# Retrieve the Fscore Excel data processed earlier

fscore_file_path = './output-files/adr_piotroskFscores_2021_2022.xlsx'
fscore_2021_file_path = './output-files/adr_piotroskFscores_2021.xlsx'
fscore_2022_file_path = './output-files/adr_piotroskiFscores_2022.xlsx'
adr_universe_file_path = './adr-universe/dr_universe.xlsx'

df_filtered3 = pd.read_excel(fscore_file_path)
df_2021 = pd.read_excel(fscore_2021_file_path)
df_2022 = pd.read_excel(fscore_2022_file_path)
df_adr_ref = pd.read_excel(adr_universe_file_path)

In [26]:
# Retrieve the stock price data processed earlier

start_prices_2022 = pd.read_excel('./output-files/start_prices_2022.xlsx')
end_prices_2022 = pd.read_excel('./output-files/end_prices_2022.xlsx')
start_prices_2023 = pd.read_excel('./output-files/start_prices_2023.xlsx')
end_prices_2023 = pd.read_excel('./output-files/end_prices_2023.xlsx')

In [27]:
# Merge the Fscore and Stock Price data tables with df_adr_ref to lookup the Region and Country

df_filtered3 = pd.merge(df_filtered3, df_adr_ref[['Symbol', 'Region', 'Country']], left_on='ticker', right_on='Symbol', how='left')
df_filtered3 = df_filtered3.drop('Symbol', axis=1)

df_2021 = pd.merge(df_2021, df_adr_ref[['Symbol', 'Region', 'Country']], left_on='ticker', right_on='Symbol', how='left')
df_2021 = df_2021.drop('Symbol', axis=1)

df_2022 = pd.merge(df_2022, df_adr_ref[['Symbol', 'Region', 'Country']], left_on='ticker', right_on='Symbol', how='left')
df_2022 = df_2022.drop('Symbol', axis=1)

In [28]:
# Merge the stock data tables

end_prices_2022_a = pd.DataFrame(end_prices_2022.iloc[0])
end_prices_2023_a = pd.DataFrame(end_prices_2023.iloc[0])
start_prices_2022_a = pd.DataFrame(start_prices_2022.iloc[0])
start_prices_2023_a = pd.DataFrame(start_prices_2023.iloc[0])

# Merge start and end prices for each year for computation of returns
merged_2022 = pd.merge(start_prices_2022_a, end_prices_2022_a, left_index=True, right_index=True, how='inner')
merged_2023 = pd.merge(start_prices_2023_a, end_prices_2023_a, left_index=True, right_index=True, how='inner')

# Compute percentage returns
merged_2022['returns_2022'] = (merged_2022['0_y'] / merged_2022['0_x']) - 1
merged_2023['returns_2023'] = (merged_2023['0_y'] / merged_2023['0_x']) - 1

# Rename columns
merged_2022.rename(columns={'0_x': 'start_price', '0_y': 'end_price'}, inplace=True)
merged_2023.rename(columns={'0_x': 'start_price', '0_y': 'end_price'}, inplace=True)

# Remove the 'ticker' as index and add as a normal column instead
merged_2022.reset_index(inplace=True)
merged_2023.reset_index(inplace=True)
merged_2022.rename(columns={'index': 'ticker'}, inplace=True)
merged_2023.rename(columns={'index': 'ticker'}, inplace=True)

# Get Fscore, Country and Region data through joining
merged_2022 = pd.merge(merged_2022, df_2021[['ticker', 'Fscore', 'Region', 'Country']], left_on='ticker', right_on='ticker', how='left')
merged_2023 = pd.merge(merged_2023, df_2022[['ticker', 'Fscore', 'Region', 'Country']], left_on='ticker', right_on='ticker', how='left')

# Create a merged dataframe with combined 2022 and 2023 data
merged_2022_2023 = pd.merge(merged_2022, merged_2023, on='ticker', how='inner')
merged_2022_2023 = merged_2022_2023.dropna() # drop records with NA values

del merged_2022_2023['Country_y'] # delete column, not needed
del merged_2022_2023['Region_y'] # delete column, not needed

merged_2022_2023.rename(columns={'start_price_x': 'start_price_2022',
                                 'end_price_x': 'end_price_2022',
                                 'Fscore_x': 'Fscore_2022',
                                 'Region_x': 'Region',
                                 'Country_x': 'Country',
                                 'start_price_y': 'start_price_2023',
                                 'end_price_y': 'end_price_2023',
                                 'Fscore_y': 'Fscore_2023',
                                },
                        inplace=True) # rename columns

In [29]:
# For historical reference, save the compiled data to an Excel file
merged_2022_2023.to_excel('./output-files/compiled_stock_level_info.xlsx', index=False)

# View the returns data - first few records
print(f"\n{'Table 1. First Few Records of Returns Data'}\n")
print(merged_2022_2023.head())


Table 1. First Few Records of Returns Data

  ticker  start_price_2022  end_price_2022  returns_2022  Fscore_2022  \
0  AAALY         33.450001       34.650002      0.035874            7   
1  AACAY          3.900000        2.200000     -0.435897            5   
2  AAVMY         14.830000       13.820000     -0.068105            6   
3   ABEV          2.720000        2.720000      0.000000            6   
5  ABTZY         10.485000        9.730000     -0.072008            7   

          Region      Country  start_price_2023  end_price_2023  returns_2023  \
0    Dev. Europe      Germany         34.650002       34.650002      0.000000   
1     Emrg. Asia        China          2.240000        2.890000      0.290179   
2    Dev. Europe  Netherlands         14.570000       14.980000      0.028140   
3  Latin America       Brazil          2.590000        2.800000      0.081081   
5     Emrg. Asia  Philippines          9.730000        9.320000     -0.042138   

   Fscore_2023  
0           

For more concise grouping in the summaries that follow, we add a Market Group column derived from the Region column.

In [30]:
# Add "Market Group" column

merged_2022_2023['Market Group'] = merged_2022_2023['Region'].replace({'Dev. Europe': 'Non-U.S. Developed Markets',
                                     'Dev. Asia': 'Non-U.S. Developed Markets',
                                     'Emrg. Asia': 'Non-U.S. Emerging Markets',
                                     'Emrg. Europe': 'Non-U.S. Emerging Markets',
                                     'Latin America': 'Non-U.S. Emerging Markets',
                                     'Middle East / Africa': 'Non-U.S. Emerging Markets'})

In [31]:
# Number of Countries and ADRs by Market Group

summary_table = pd.pivot_table(merged_2022_2023, values='Country', index='Market Group', aggfunc={'Country': ['count', 'nunique']}, fill_value=0)
summary_table.columns = ['Count of ADR Stocks', 'Count of Countries']
summary_table.loc['Total'] = summary_table.sum()
print(f"\n{'Table 2. Number of Countries and ADR Stocks by Market Group'}\n")
print(summary_table)


Table 2. Number of Countries and ADR Stocks by Market Group

                            Count of ADR Stocks  Count of Countries
Market Group                                                       
Non-U.S. Developed Markets                  402                  23
Non-U.S. Emerging Markets                   255                  25
Total                                       657                  48


## 4. Measure by Market Group. Compute returns and compare by Region: High-F-score stocks VS Low-F-score stocks VS ADR Stock Index

### 4.1  Set up the ADR stock index and compute for the returns
The ADR Stock Index is created using equal-weighted returns of all the ADR stocks in a particular 'market group'. This is used as a baseline when comparing the performance of high-F-score and low-F-score stocks. 
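
A minimal sketch of this baseline, using made-up returns: the index return of each market group is the simple mean of its members' returns, and each stock can then be measured against its own group's mean. The tickers, values, and the `excess` column are hypothetical.

```python
import pandas as pd

# Hypothetical stock-level returns (illustrative values only)
toy = pd.DataFrame({
    "ticker": ["AAA", "BBB", "CCC", "DDD"],
    "Market Group": ["Developed", "Developed", "Emerging", "Emerging"],
    "returns": [0.10, -0.20, 0.05, 0.15],
})

# Equal-weighted index: every stock in a group carries the same weight
index_returns = toy.groupby("Market Group")["returns"].mean()

# Return of each stock relative to its own group's index
toy["excess"] = toy["returns"] - toy.groupby("Market Group")["returns"].transform("mean")

print(index_returns)
print(toy)
```

A stock counts as beating the baseline when its `excess` value is positive.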

In [32]:
''' Declare what level of information should appear in the rows of the summary tables.
For example, use pivot_row='Region' if summary by region is needed,
use pivot_row='Market Group' if summary by market group is needed, and so on.
'''
pivot_row = 'Market Group'

In [2]:
# Define a function to calculate equal-weighted returns (Index)

def equal_weighted_returns(df):
    return df['returns'].mean()

In [3]:
# Define a function to calculate other metrics such as precision metrics

def calculate_accuracy(df, pivot_row):
    # Create a dictionary to store the results
    summary_dict = {pivot_row: [], 'Expected Winner': [], 'Actual Winner': [], 'Expected Loser': [], 'Actual Loser': [], 'HF Precision': [], 'LF Precision': [], 'Overall Precision': []}
    
    # Calculate and add metrics for each pivot_row
    for value in df[pivot_row].unique():
        value_df = df[df[pivot_row] == value]

        # Expected Winner (EW)
        ew = len(value_df[(value_df['Fscore'] > 6)])

        # Actual Winner (AW)
        aw = len(value_df[(value_df['Fscore'] > 6) & (value_df['returns'] > equal_weighted_returns(value_df))])

        # Expected Loser (EL)
        el = len(value_df[(value_df['Fscore'] < 4)])

        # Actual Loser (AL)
        al = len(value_df[(value_df['Fscore'] < 4) & (value_df['returns'] < equal_weighted_returns(value_df))])

        # Precision
        hf_precision = aw / ew
        lf_precision = al / el
        overall_precision = (aw + al) / (ew + el)

        # Add data to the summary dictionary
        summary_dict[pivot_row].append(value)
        summary_dict['Expected Winner'].append(ew)
        summary_dict['Actual Winner'].append(aw)
        summary_dict['Expected Loser'].append(el)
        summary_dict['Actual Loser'].append(al)
        summary_dict['HF Precision'].append(hf_precision)
        summary_dict['LF Precision'].append(lf_precision)
        summary_dict['Overall Precision'].append(overall_precision)

    # Convert lists to Pandas Series before calculating sums
    expected_winner_series = pd.Series(summary_dict['Expected Winner'])
    actual_winner_series = pd.Series(summary_dict['Actual Winner'])
    expected_loser_series = pd.Series(summary_dict['Expected Loser'])
    actual_loser_series = pd.Series(summary_dict['Actual Loser'])

    # Calculate overall metrics for the entire dataset
    overall_ew = expected_winner_series.sum()
    overall_aw = actual_winner_series.sum()
    overall_el = expected_loser_series.sum()
    overall_al = actual_loser_series.sum()
    overall_hf_precision = overall_aw / overall_ew
    overall_lf_precision = overall_al / overall_el
    overall_overall_precision = (overall_aw + overall_al) / (overall_ew + overall_el)

    # Add 'Overall' row to the summary dictionary
    summary_dict[pivot_row].append('Overall')
    summary_dict['Expected Winner'].append(overall_ew)
    summary_dict['Actual Winner'].append(overall_aw)
    summary_dict['Expected Loser'].append(overall_el)
    summary_dict['Actual Loser'].append(overall_al)
    summary_dict['HF Precision'].append(overall_hf_precision)
    summary_dict['LF Precision'].append(overall_lf_precision)
    summary_dict['Overall Precision'].append(overall_overall_precision)

    # Create a DataFrame from the summary dictionary
    summary = pd.DataFrame(summary_dict)
    
    return summary

In [35]:
# Calculate the 2022 Returns by Market Group
pivot_row = 'Market Group'

# Save 2022 data as 'df'
columns_selected = ['ticker', 'returns_2022','Fscore_2022', 'Market Group']
df = merged_2022_2023[columns_selected].rename(columns={'returns_2022': 'returns',
                                                        'Fscore_2022': 'Fscore'}) # rename columns on a new DataFrame

# Create a dictionary to store the results
summary_dict = {pivot_row: [], 'Index_Returns': [], 'Low_Fscore_Returns': [], 'High_Fscore_Returns': []}

# Calculate and add returns for Fscore categories and by pivot_row
for value in df[pivot_row].unique():
    value_df = df[df[pivot_row] == value]
    
    # Equal-weighted returns for the pivot_row
    index_returns = equal_weighted_returns(value_df)
    
    # Equal-weighted returns for Fscore categories
    fscore_0_3_returns = equal_weighted_returns(value_df[(value_df['Fscore'] >= 0) & (value_df['Fscore'] <= 3)])
    fscore_4_6_returns = equal_weighted_returns(value_df[(value_df['Fscore'] >= 4) & (value_df['Fscore'] <= 6)])
    fscore_7_9_returns = equal_weighted_returns(value_df[(value_df['Fscore'] >= 7) & (value_df['Fscore'] <= 9)])
 
    # Add data to the summary dictionary
    summary_dict[pivot_row].append(value)
    summary_dict['Index_Returns'].append(index_returns)
    summary_dict['Low_Fscore_Returns'].append(fscore_0_3_returns)
    summary_dict['High_Fscore_Returns'].append(fscore_7_9_returns)
    
# Calculate overall index returns for the entire dataset (all market groups combined)
overall_returns_index = equal_weighted_returns(df)

# Equal-weighted returns for low- and high-F-score stocks in the entire dataset (all market groups combined)
overall_returns_low_fscore = equal_weighted_returns(df[(df['Fscore'] >= 0) & (df['Fscore'] <= 3)])
overall_returns_high_fscore = equal_weighted_returns(df[df['Fscore'] >= 7])

# Add data to the summary dictionary for overall returns
summary_dict[pivot_row].append('Overall')
summary_dict['Index_Returns'].append(overall_returns_index)
summary_dict['Low_Fscore_Returns'].append(overall_returns_low_fscore)
summary_dict['High_Fscore_Returns'].append(overall_returns_high_fscore)
    
# Create a DataFrame from the summary dictionary
summary_df1 = pd.DataFrame(summary_dict)

In [36]:
# Calculate the precision metrics for 2022 by Market Group: HF Precision, LF Precision, and Overall Precision
summary_df1_metrics = calculate_accuracy(df, pivot_row)

In [37]:
# Calculate the 2023 Returns by Market Group

# Save 2023 data as 'df'
columns_selected = ['ticker', 'returns_2023','Fscore_2023', 'Market Group']
df = merged_2022_2023[columns_selected].rename(columns={'returns_2023': 'returns',
                                                        'Fscore_2023': 'Fscore'}) # rename columns on a new DataFrame

# Create a dictionary to store the results
summary_dict = {pivot_row: [], 'Index_Returns': [], 'Low_Fscore_Returns': [], 'High_Fscore_Returns': []}

# Calculate and add returns for Fscore categories and by pivot_row
for value in df[pivot_row].unique():
    value_df = df[df[pivot_row] == value]
    
    # Equal-weighted returns for the pivot_row
    index_returns = equal_weighted_returns(value_df)
    
    # Equal-weighted returns for Fscore categories
    fscore_0_3_returns = equal_weighted_returns(value_df[(value_df['Fscore'] >= 0) & (value_df['Fscore'] <= 3)])
    fscore_4_6_returns = equal_weighted_returns(value_df[(value_df['Fscore'] >= 4) & (value_df['Fscore'] <= 6)])
    fscore_7_9_returns = equal_weighted_returns(value_df[(value_df['Fscore'] >= 7) & (value_df['Fscore'] <= 9)])
      
    # Add data to the summary dictionary
    summary_dict[pivot_row].append(value)
    summary_dict['Index_Returns'].append(index_returns)
    summary_dict['Low_Fscore_Returns'].append(fscore_0_3_returns)
    summary_dict['High_Fscore_Returns'].append(fscore_7_9_returns)
    
# Calculate overall index returns for the entire dataset (all market groups combined)
overall_returns_index = equal_weighted_returns(df)

# Equal-weighted returns for low- and high-F-score stocks in the entire dataset (all market groups combined)
overall_returns_low_fscore = equal_weighted_returns(df[(df['Fscore'] >= 0) & (df['Fscore'] <= 3)])
overall_returns_high_fscore = equal_weighted_returns(df[df['Fscore'] >= 7])

# Add data to the summary dictionary for overall returns
summary_dict[pivot_row].append('Overall')
summary_dict['Index_Returns'].append(overall_returns_index)
summary_dict['Low_Fscore_Returns'].append(overall_returns_low_fscore)
summary_dict['High_Fscore_Returns'].append(overall_returns_high_fscore)

# Create a DataFrame from the summary dictionary
summary_df2 = pd.DataFrame(summary_dict)

In [38]:
# Calculate the precision metrics for 2023 by Market Group: HF Precision, LF Precision, and Overall Precision
summary_df2_metrics = calculate_accuracy(df, pivot_row)

### 4.2  Compare the results: High-F-score stocks VS Low-F-score stocks VS ADR Stock Index by region

In [62]:
# Print the 2022 returns summary
print(f"\n{'Table 3. 2022 Returns Summary by Market Group'}\n")
print(summary_df1)


Table 3. 2022 Returns Summary by Market Group

                 Market Group  Index_Returns  Low_Fscore_Returns  \
0  Non-U.S. Developed Markets      -0.174058           -0.286524   
1   Non-U.S. Emerging Markets      -0.086448           -0.218753   
2                     Overall      -0.141157           -0.256965   

   High_Fscore_Returns  
0            -0.143242  
1            -0.073655  
2            -0.122183  


Based on the 2022 results in Table 3, high-F-score stocks overall perform better than both the benchmark (the equal-weighted ADR Stock Index) and the low-F-score stocks. High-F-score stocks outperform the low-F-score stocks by 13.48 percentage points and the benchmark by 1.90 percentage points. Low-F-score stocks underperform the benchmark by 11.58 percentage points.
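
The spreads quoted above are simple differences between the "Overall" values in Table 3. The short check below reproduces them; the variable names are chosen here for illustration.

```python
# "Overall" row of Table 3 (2022 returns, as decimals)
index_ret, low_ret, high_ret = -0.141157, -0.256965, -0.122183

# Spreads expressed as differences in returns
high_vs_low = high_ret - low_ret
high_vs_index = high_ret - index_ret
low_vs_index = low_ret - index_ret

for label, spread in [("High vs. low F-score", high_vs_low),
                      ("High F-score vs. index", high_vs_index),
                      ("Low F-score vs. index", low_vs_index)]:
    # Multiply by 100 to report percentage points
    print(f"{label}: {spread * 100:+.2f} percentage points")
```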

In [63]:
# Print the 2023 returns summary
print(f"\n{'Table 4. 2023 Returns Summary by Market Group'}\n")
print(summary_df2)


Table 4. 2023 Returns Summary by Market Group

                 Market Group  Index_Returns  Low_Fscore_Returns  \
0  Non-U.S. Developed Markets       0.059156           -0.013127   
1   Non-U.S. Emerging Markets       0.074427            0.095927   
2                     Overall       0.064891            0.031204   

   High_Fscore_Returns  
0             0.135085  
1             0.163673  
2             0.143644  


Results for 2023 in Table 4 again show that high-F-score stocks perform better than both the benchmark and the low-F-score stocks. Overall, high-F-score stocks outperform the low-F-score stocks by 11.24 percentage points and the benchmark by 7.88 percentage points. Low-F-score stocks from Non-U.S. Emerging Markets outperform the benchmark by 2.15 percentage points, while low-F-score stocks from Non-U.S. Developed Markets underperform it by 7.23 percentage points. Overall, low-F-score stocks underperform the benchmark by 3.37 percentage points.


In [64]:
# Print the 2022 Accuracy Metrics
print(f"\n{'Table 5. 2022 Metrics Summary by Market Group'}\n")
print(summary_df1_metrics)


Table 5. 2022 Metrics Summary by Market Group

                 Market Group  Expected Winner  Actual Winner  Expected Loser  \
0  Non-U.S. Developed Markets              159             85              53   
1   Non-U.S. Emerging Markets               69             32              41   
2                     Overall              228            117              94   

   Actual Loser  HF Precision  LF Precision  Overall Precision  
0            33      0.534591      0.622642           0.556604  
1            25      0.463768      0.609756           0.518182  
2            58      0.513158      0.617021           0.543478  


In [65]:
# Print the 2023 Accuracy Metrics
print(f"\n{'Table 6. 2023 Metrics Summary by Market Group'}\n")
print(summary_df2_metrics)


Table 6. 2023 Metrics Summary by Market Group

                 Market Group  Expected Winner  Actual Winner  Expected Loser  \
0  Non-U.S. Developed Markets              117             63              73   
1   Non-U.S. Emerging Markets               50             25              50   
2                     Overall              167             88             123   

   Actual Loser  HF Precision  LF Precision  Overall Precision  
0            50      0.538462      0.684932           0.594737  
1            37      0.500000      0.740000           0.620000  
2            87      0.526946      0.707317           0.603448  


Accuracy metrics are presented in Table 5 and Table 6. All precision metrics for 2023 are higher than those for 2022. The most noteworthy change is the increase in Low F-score Precision for Non-U.S. Emerging Markets of about 13 percentage points, from 60.98% in 2022 to 74.00% in 2023. Consequently, the overall Low F-score Precision increased by about 9 percentage points, from 61.70% in 2022 to 70.73% in 2023.
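The precision columns in the metrics tables are consistent with dividing each "Actual" count by the matching "Expected" count. The sketch below reproduces the Overall row of the 2022 metrics table under that assumption. `precision_from_counts` is an illustrative helper, not the notebook's `calculate_accuracy` function.

```python
def precision_from_counts(expected_winner, actual_winner, expected_loser, actual_loser):
    """Precision of the high-F-score, low-F-score and combined predictions."""
    hf_precision = actual_winner / expected_winner
    lf_precision = actual_loser / expected_loser
    overall = (actual_winner + actual_loser) / (expected_winner + expected_loser)
    return hf_precision, lf_precision, overall

# Overall row of the 2022 metrics summary: 228 expected winners, 117 actual,
# 94 expected losers, 58 actual
hf, lf, overall = precision_from_counts(228, 117, 94, 58)
print(f"HF {hf:.6f}  LF {lf:.6f}  Overall {overall:.6f}")
```

The printed values can be compared against the HF Precision, LF Precision and Overall Precision columns of the same row.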

## 5. Reproducing the results based on January 18, 2024 original data

Stock prices and financial data are extracted via the publicly available Yahoo Finance APIs and subject to both the availability and accuracy limitations as stated in Section 8 (Warranties and Disclaimers) of the Yahoo Terms of Service [3]. To keep data consistency in this research, only stocks with complete information are included in our stock universe. This ensures that 2021 and 2022 Piotroski F-scores can be calculated and compared against 2022 and 2023 returns, respectively.
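The completeness rule described above can be sketched in plain Python. The tickers, values, and the helper `complete_pairs` are all hypothetical. The notebook itself applies the rule with pandas merges and `dropna()`, so this is an illustration of the idea rather than the code that produced the results.

```python
# Hypothetical inputs: F-scores keyed by fiscal year, returns keyed by calendar year
fscores = {"AAA": {2021: 7, 2022: 5}, "BBB": {2021: 3}, "CCC": {2021: 6, 2022: 8}}
returns = {"AAA": {2022: 0.04, 2023: 0.00},
           "BBB": {2022: -0.10, 2023: 0.12},
           "CCC": {2022: -0.07}}

def complete_pairs(fscores, returns, return_years=(2022, 2023)):
    """Keep tickers with an F-score for year t-1 and a return for year t, for every t."""
    kept = {}
    for ticker in fscores.keys() & returns.keys():
        pairs = {}
        for year in return_years:
            score = fscores[ticker].get(year - 1)
            ret = returns[ticker].get(year)
            if score is None or ret is None:
                break  # incomplete information: drop the ticker
            pairs[year] = (score, ret)
        else:
            kept[ticker] = pairs
    return kept

universe = complete_pairs(fscores, returns)
print(universe)
```

Here only `AAA` survives: `BBB` lacks a 2022 F-score and `CCC` lacks a 2023 return.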

Scripts in Section 2 through Section 4 all use live data. Because live data can be inconsistent due to unsuccessful fetches, delisting, and the availability and accuracy limitations mentioned above, pre-downloaded data from the January 18, 2024 pull are saved and used to reproduce the results in this section. This does not change the general and overall results of the research.

### 5.1  Retrieve pre-downloaded data
Retrieve the January 18, 2024 pre-downloaded data.

In [4]:
# Retrieve the Fscore Excel data processed on January 18, 2024

fscore_file_path = './output-files/01-18-2024/adr_piotroskFscores_2021_2022.xlsx'
fscore_2021_file_path = './output-files/01-18-2024/adr_piotroskFscores_2021.xlsx'
fscore_2022_file_path = './output-files/01-18-2024/adr_piotroskiFscores_2022.xlsx'
adr_universe_file_path = './adr-universe/dr_universe.xlsx'

df_filtered3 = pd.read_excel(fscore_file_path)
df_2021 = pd.read_excel(fscore_2021_file_path)
df_2022 = pd.read_excel(fscore_2022_file_path)
df_adr_ref = pd.read_excel(adr_universe_file_path)

In [5]:
# Retrieve the stock price data processed on January 18, 2024

start_prices_2022 = pd.read_excel('./output-files/01-18-2024/start_prices_2022.xlsx')
end_prices_2022 = pd.read_excel('./output-files/01-18-2024/end_prices_2022.xlsx')
start_prices_2023 = pd.read_excel('./output-files/01-18-2024/start_prices_2023.xlsx')
end_prices_2023 = pd.read_excel('./output-files/01-18-2024/end_prices_2023.xlsx')

In [6]:
# Merge the Fscore and Stock Price data tables with df_adr_ref to lookup the Region and Country (data as of January 18, 2024)

df_filtered3a = pd.merge(df_filtered3, df_adr_ref[['Symbol', 'Region', 'Country']], left_on='ticker', right_on='Symbol', how='left')
df_filtered3 = df_filtered3a.drop('Symbol', axis=1)

df_2021 = pd.merge(df_2021, df_adr_ref[['Symbol', 'Region', 'Country']], left_on='ticker', right_on='Symbol', how='left')
df_2021 = df_2021.drop('Symbol', axis=1)

df_2022 = pd.merge(df_2022, df_adr_ref[['Symbol', 'Region', 'Country']], left_on='ticker', right_on='Symbol', how='left')
df_2022 = df_2022.drop('Symbol', axis=1)

In [7]:
# Merge the stock data tables (data as of January 18, 2024)

end_prices_2022_b = pd.DataFrame(end_prices_2022.iloc[0])
end_prices_2023_b = pd.DataFrame(end_prices_2023.iloc[0])
start_prices_2022_b = pd.DataFrame(start_prices_2022.iloc[0])
start_prices_2023_b = pd.DataFrame(start_prices_2023.iloc[0])

# Merge start and end prices for each year for computation of returns
merged_2022 = pd.merge(start_prices_2022_b, end_prices_2022_b, left_index=True, right_index=True, how='inner')
merged_2023 = pd.merge(start_prices_2023_b, end_prices_2023_b, left_index=True, right_index=True, how='inner')

# Compute percentage returns
merged_2022['returns_2022'] = (merged_2022['0_y'] / merged_2022['0_x']) - 1
merged_2023['returns_2023'] = (merged_2023['0_y'] / merged_2023['0_x']) - 1

# Rename columns
merged_2022.rename(columns={'0_x': 'start_price', '0_y': 'end_price'}, inplace=True)
merged_2023.rename(columns={'0_x': 'start_price', '0_y': 'end_price'}, inplace=True)

# Remove the 'ticker' as index and add as a normal column instead
merged_2022.reset_index(inplace=True)
merged_2023.reset_index(inplace=True)
merged_2022.rename(columns={'index': 'ticker'}, inplace=True)
merged_2023.rename(columns={'index': 'ticker'}, inplace=True)

# Get Fscore, Country and Region data by joining on ticker
merged_2022 = pd.merge(merged_2022, df_2021[['ticker', 'Fscore', 'Region', 'Country']], on='ticker', how='left')
merged_2023 = pd.merge(merged_2023, df_2022[['ticker', 'Fscore', 'Region', 'Country']], on='ticker', how='left')

# Create a merged dataframe with combined 2022 and 2023 data
merged_2022_2023 = pd.merge(merged_2022, merged_2023, on='ticker', how='inner')
merged_2022_2023 = merged_2022_2023.dropna() # drop records with NA values

del merged_2022_2023['Country_y'] # delete column, not needed
del merged_2022_2023['Region_y'] # delete column, not needed

merged_2022_2023.rename(columns={'start_price_x': 'start_price_2022',
                                 'end_price_x': 'end_price_2022',
                                 'Fscore_x': 'Fscore_2022',
                                 'Region_x': 'Region',
                                 'Country_x': 'Country',
                                 'start_price_y': 'start_price_2023',
                                 'end_price_y': 'end_price_2023',
                                 'Fscore_y': 'Fscore_2023',
                                },
                        inplace=True) # rename columns

In [8]:
# View the returns data - first few records (data as of January 18, 2024)
print(f"\n{'Table B1. First Few Records of Returns Data'}\n")
print(merged_2022_2023.head())


Table B1. First Few Records of Returns Data

  ticker  start_price_2022  end_price_2022  returns_2022  Fscore_2022  \
0  AAALY         33.450001       34.650002      0.035874            7   
1  AACAY          3.900000        2.200000     -0.435897            5   
2  AAGIY         40.650002       44.430000      0.092989            4   
3  AAVMY         14.830000       13.820000     -0.068105            6   
4  ABDBY          3.550000        4.040000      0.138028            0   

        Region      Country  start_price_2023  end_price_2023  returns_2023  \
0  Dev. Europe      Germany         34.650002       34.650002      0.000000   
1   Emrg. Asia        China          2.240000        2.890000      0.290179   
2    Dev. Asia    Hong Kong         45.799999       34.669998     -0.243013   
3  Dev. Europe  Netherlands         14.570000       14.980000      0.028140   
4  Dev. Europe      Denmark          4.040000        4.040000      0.000000   

   Fscore_2023  
0            5  
1     

For more concise grouping in our summary later, we add a Market Group column derived from the data in the Region column.

In [9]:
# Add "Market Group" column (data as of January 18, 2024)
merged_2022_2023['Market Group'] = merged_2022_2023['Region'].replace({'Dev. Europe': 'Non-U.S. Developed Markets',
                                     'Dev. Asia': 'Non-U.S. Developed Markets',
                                     'Emrg. Asia': 'Non-U.S. Emerging Markets',
                                     'Emrg. Europe': 'Non-U.S. Emerging Markets',
                                     'Latin America': 'Non-U.S. Emerging Markets',
                                     'Middle East / Africa': 'Non-U.S. Emerging Markets'})

# Create summary of number of Countries and ADRs by Market Group (data as of January 18, 2024)
summary_table = pd.pivot_table(merged_2022_2023, values='Country', index='Market Group', aggfunc={'Country': ['count', 'nunique']}, fill_value=0)
summary_table.columns = ['Count of ADR Stocks', 'Count of Countries']
summary_table.loc['Total'] = summary_table.sum()
print(f"\n{'Table B2. Number of Countries and ADR Stocks by Market Group'}\n")
print(summary_table)


Table B2. Number of Countries and ADR Stocks by Market Group

                            Count of ADR Stocks  Count of Countries
Market Group                                                       
Non-U.S. Developed Markets                  429                  24
Non-U.S. Emerging Markets                   258                  25
Total                                       687                  49


### 5.2  Set up the ADR stock index and compute the returns
The ADR Stock Index is created using equal-weighted returns of all the ADR stocks in a particular 'market group'. This is used as a baseline when comparing the performance of high-F-score and low-F-score stocks. 
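The `equal_weighted_returns` helper used in the cells of this section is defined in an earlier section. As a reminder of what "equal-weighted" means here, the following is a minimal stand-in that assumes it is the arithmetic mean of the per-stock returns. The function name, tickers, and values below are hypothetical.

```python
from statistics import fmean

def equal_weighted_return(returns):
    """Arithmetic mean of per-stock returns: each stock carries weight 1/N."""
    returns = list(returns)
    return fmean(returns) if returns else float("nan")

# Hypothetical (ticker, return, F-score) rows for one market group
rows = [("AAA", 0.10, 8), ("BBB", -0.20, 2), ("CCC", 0.05, 5), ("DDD", 0.25, 7)]

index_return = equal_weighted_return(r for _, r, _ in rows)            # all stocks
high_return = equal_weighted_return(r for _, r, f in rows if f >= 7)   # F-score 7 to 9
low_return = equal_weighted_return(r for _, r, f in rows if f <= 3)    # F-score 0 to 3
print(index_return, high_return, low_return)
```

Because every stock has the same weight, a single extreme return moves the group average more in a small F-score bucket than in the full index.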

In [18]:
# Calculate the 2022 Returns by Market Group (data as of January 18, 2024)
pivot_row = 'Market Group'

# Save 2022 data as 'df'; rename on the selected copy to avoid pandas' SettingWithCopyWarning
columns_selected = ['ticker', 'returns_2022','Fscore_2022', 'Market Group']
df = merged_2022_2023[columns_selected].rename(columns={'returns_2022': 'returns',
                                                        'Fscore_2022': 'Fscore'})

# Create a dictionary to store the results
summary_dict = {pivot_row: [], 'Index_Returns': [], 'Low_Fscore_Returns': [], 'High_Fscore_Returns': []}

# Calculate and add returns for Fscore categories and by pivot_row
for value in df[pivot_row].unique():
    value_df = df[df[pivot_row] == value]
    
    # Equal-weighted returns for the market group (the benchmark index)
    index_returns = equal_weighted_returns(value_df)
    
    # Equal-weighted returns for Fscore categories
    fscore_0_3_returns = equal_weighted_returns(value_df[(value_df['Fscore'] >= 0) & (value_df['Fscore'] <= 3)])
    fscore_4_6_returns = equal_weighted_returns(value_df[(value_df['Fscore'] >= 4) & (value_df['Fscore'] <= 6)])
    fscore_7_9_returns = equal_weighted_returns(value_df[(value_df['Fscore'] >= 7) & (value_df['Fscore'] <= 9)])
 
    # Add data to the summary dictionary
    summary_dict[pivot_row].append(value)
    summary_dict['Index_Returns'].append(index_returns)
    summary_dict['Low_Fscore_Returns'].append(fscore_0_3_returns)
    summary_dict['High_Fscore_Returns'].append(fscore_7_9_returns)
    
# Calculate overall returns for the entire dataset (all market groups combined)
overall_returns_index = equal_weighted_returns(df)

# Equal-weighted returns for the low and high Fscore categories across all market groups
overall_returns_low_fscore = equal_weighted_returns(df[(df['Fscore'] >= 0) & (df['Fscore'] <= 3)])
overall_returns_high_fscore = equal_weighted_returns(df[df['Fscore'] >= 7])

# Add data to the summary dictionary for overall returns
summary_dict[pivot_row].append('Overall')
summary_dict['Index_Returns'].append(overall_returns_index)
summary_dict['Low_Fscore_Returns'].append(overall_returns_low_fscore)
summary_dict['High_Fscore_Returns'].append(overall_returns_high_fscore)
    
# Create a DataFrame from the summary dictionary
summary_df1a = pd.DataFrame(summary_dict)

In [19]:
# Calculate the 2022 classification metrics by Market Group: expected/actual winners and losers, and precision (data as of January 18, 2024)
summary_df1a_metrics = calculate_accuracy(df, pivot_row)

In [20]:
# Calculate the 2023 Returns by Market Group (data as of January 18, 2024)

# Save 2023 data as 'df'; rename on the selected copy to avoid pandas' SettingWithCopyWarning
columns_selected = ['ticker', 'returns_2023','Fscore_2023', 'Market Group']
df = merged_2022_2023[columns_selected].rename(columns={'returns_2023': 'returns',
                                                        'Fscore_2023': 'Fscore'})

# Create a dictionary to store the results
summary_dict = {pivot_row: [], 'Index_Returns': [], 'Low_Fscore_Returns': [], 'High_Fscore_Returns': []}

# Calculate and add returns for Fscore categories and by pivot_row
for value in df[pivot_row].unique():
    value_df = df[df[pivot_row] == value]
    
    # Equal-weighted returns for the pivot_row
    index_returns = equal_weighted_returns(value_df)
    
    # Equal-weighted returns for Fscore categories
    fscore_0_3_returns = equal_weighted_returns(value_df[(value_df['Fscore'] >= 0) & (value_df['Fscore'] <= 3)])
    fscore_4_6_returns = equal_weighted_returns(value_df[(value_df['Fscore'] >= 4) & (value_df['Fscore'] <= 6)])
    fscore_7_9_returns = equal_weighted_returns(value_df[(value_df['Fscore'] >= 7) & (value_df['Fscore'] <= 9)])
      
    # Add data to the summary dictionary
    summary_dict[pivot_row].append(value)
    summary_dict['Index_Returns'].append(index_returns)
    summary_dict['Low_Fscore_Returns'].append(fscore_0_3_returns)
    summary_dict['High_Fscore_Returns'].append(fscore_7_9_returns)
    
# Calculate overall returns for the entire dataset (all market groups combined)
overall_returns_index = equal_weighted_returns(df)

# Equal-weighted returns for the low and high Fscore categories across all market groups
overall_returns_low_fscore = equal_weighted_returns(df[(df['Fscore'] >= 0) & (df['Fscore'] <= 3)])
overall_returns_high_fscore = equal_weighted_returns(df[df['Fscore'] >= 7])

# Add data to the summary dictionary for overall returns
summary_dict[pivot_row].append('Overall')
summary_dict['Index_Returns'].append(overall_returns_index)
summary_dict['Low_Fscore_Returns'].append(overall_returns_low_fscore)
summary_dict['High_Fscore_Returns'].append(overall_returns_high_fscore)

# Create a DataFrame from the summary dictionary
summary_df2a = pd.DataFrame(summary_dict)

In [21]:
# Calculate the 2023 classification metrics by Market Group: expected/actual winners and losers, and precision (data as of January 18, 2024)
summary_df2a_metrics = calculate_accuracy(df, pivot_row)

### 5.3  Compare the results: High-F-score stocks vs. Low-F-score stocks vs. ADR Stock Index by market group

In [22]:
# Print the 2022 returns summary (data as of January 18, 2024)
print(f"\n{'Table B3. 2022 Returns Summary by Market Group'}\n")
print(summary_df1a)


Table B3. 2022 Returns Summary by Market Group

                 Market Group  Index_Returns  Low_Fscore_Returns  \
0  Non-U.S. Developed Markets      -0.174058           -0.286524   
1   Non-U.S. Emerging Markets      -0.086448           -0.218753   
2                     Overall      -0.141157           -0.256965   

   High_Fscore_Returns  
0            -0.143242  
1            -0.073655  
2            -0.122183  


Based on the results for 2022 in Table B3, high-F-score stocks perform better overall than both the benchmark and the low-F-score stocks. High-F-score stocks outperform the low-F-score stocks by 13.48 percentage points and the benchmark by 1.90 percentage points. Low-F-score stocks underperform the benchmark by 11.58 percentage points.

In [23]:
# Print the 2023 returns summary (data as of January 18, 2024)
print(f"\n{'Table B4. 2023 Returns Summary by Market Group'}\n")
print(summary_df2a)


Table B4. 2023 Returns Summary by Market Group

                 Market Group  Index_Returns  Low_Fscore_Returns  \
0  Non-U.S. Developed Markets       0.059156           -0.013127   
1   Non-U.S. Emerging Markets       0.074427            0.095927   
2                     Overall       0.064891            0.031204   

   High_Fscore_Returns  
0             0.135085  
1             0.163673  
2             0.143644  


Results for 2023 in Table B4 again show that high-F-score stocks perform better than both the benchmark and the low-F-score stocks. High-F-score stocks outperform the low-F-score stocks by 11.24 percentage points and the benchmark by 7.88 percentage points. Low-F-score stocks from Non-U.S. Emerging Markets outperform their benchmark by 2.15 percentage points, while low-F-score stocks from Non-U.S. Developed Markets underperform theirs by 7.23 percentage points. Overall, low-F-score stocks underperform the benchmark by 3.37 percentage points.


In [24]:
# Print the 2022 Accuracy Metrics (data as of January 18, 2024)
print(f"\n{'Table B5. 2022 Metrics Summary by Market Group'}\n")
print(summary_df1a_metrics)


Table B5. 2022 Metrics Summary by Market Group

                 Market Group  Expected Winner  Actual Winner  Expected Loser  \
0  Non-U.S. Developed Markets              159             85              53   
1   Non-U.S. Emerging Markets               69             32              41   
2                     Overall              228            117              94   

   Actual Loser  HF Precision  LF Precision  Overall Precision  
0            33      0.534591      0.622642           0.556604  
1            25      0.463768      0.609756           0.518182  
2            58      0.513158      0.617021           0.543478  


In [25]:
# Print the 2023 Accuracy Metrics (data as of January 18, 2024)
print(f"\n{'Table B6. 2023 Metrics Summary by Market Group'}\n")
print(summary_df2a_metrics)


Table B6. 2023 Metrics Summary by Market Group

                 Market Group  Expected Winner  Actual Winner  Expected Loser  \
0  Non-U.S. Developed Markets              117             63              73   
1   Non-U.S. Emerging Markets               50             25              50   
2                     Overall              167             88             123   

   Actual Loser  HF Precision  LF Precision  Overall Precision  
0            50      0.538462      0.684932           0.594737  
1            37      0.500000      0.740000           0.620000  
2            87      0.526946      0.707317           0.603448  


Accuracy metrics are presented in Table B5 and Table B6. All precision metrics for 2023 are higher than those for 2022. The most noteworthy change is the increase in Low F-score Precision for Non-U.S. Emerging Markets of about 13 percentage points, from 60.98% in 2022 to 74.00% in 2023. Consequently, the overall Low F-score Precision increased by about 9 percentage points, from 61.70% in 2022 to 70.73% in 2023.

Based on the contrasting benchmark returns shown in Table B3 and Table B4, 2022 and 2023 can be treated as two periods with different market conditions: 2022 as a period under a ‘bad’ market condition and 2023 as a period under a relatively ‘good’ one. Setting aside the part of the return metrics affected by outliers, the returns indicate that the F-score classifier is a better predictor under good market conditions. The precision metrics support this, with higher precision values in 2023 than in 2022.

The return-based results in Table B3 and Table B4 show that, overall, high-F-score stocks outperform low-F-score stocks by more than 10 percentage points (13.48 in 2022 and 11.24 in 2023), and that the spread is positive in both Non-U.S. Emerging Markets and Non-U.S. Developed Markets in both years. High-F-score stocks outperform the benchmark by 1.90 percentage points in 2022 and by 7.88 percentage points in 2023. Measured against their own market-group benchmark, high-F-score stocks from Non-U.S. Developed Markets do better than those from Non-U.S. Emerging Markets by 1.80 percentage points in 2022 (an excess return of 3.08 versus 1.28). In 2023 the order reverses, and high-F-score stocks from Non-U.S. Emerging Markets do better by 1.33 percentage points (8.92 versus 7.59). The better performance of high-F-score stocks against the benchmark supports the results from previous studies that the Piotroski F-score can be an effective tool for generating Alpha.
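The group-level comparisons can be recomputed directly from Table B3 and Table B4. The sketch below does so with values copied from those tables. The dictionary layout and key names are illustrative only.

```python
# (index, low-F-score, high-F-score) returns copied from Table B3 and Table B4
table_returns = {
    2022: {"Developed": (-0.174058, -0.286524, -0.143242),
           "Emerging": (-0.086448, -0.218753, -0.073655)},
    2023: {"Developed": (0.059156, -0.013127, 0.135085),
           "Emerging": (0.074427, 0.095927, 0.163673)},
}

summary = {}
for year, groups in table_returns.items():
    for group, (index_ret, low_ret, high_ret) in groups.items():
        summary[(year, group)] = {
            # spread between high- and low-F-score portfolios, in percentage points
            "high_minus_low_pp": round((high_ret - low_ret) * 100, 2),
            # excess return of the high-F-score portfolio over its group benchmark
            "high_excess_pp": round((high_ret - index_ret) * 100, 2),
        }

for key, value in summary.items():
    print(key, value)
```

Comparing `high_excess_pp` across the two groups within a year gives the benchmark-relative gap between Developed and Emerging Markets.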

The precision results show that High F-score Precision (the ability to predict winner stocks) is slightly better in Non-U.S. Developed Markets than in Non-U.S. Emerging Markets in both years. Low F-score Precision (the ability to predict loser stocks) is better in Non-U.S. Developed Markets in 2022, but then reverses and is better in Non-U.S. Emerging Markets in 2023. Overall, both the return and the precision results are broadly similar between the two market groups, especially when the 2022 and 2023 results are considered together.

## 6. Conclusion

We have tested the Piotroski F-score on ADRs using 687 stocks (429 from developed markets and 258 from emerging markets) from 49 different non-U.S. home countries. Portfolio-level return rates show that high-F-score stocks outperform the low-F-score stocks by about 10% and the benchmark by about 3%. These figures are consistent with, or close to, the results in the existing literature on the Piotroski F-score, where such numbers are considered significant. Excluding some outliers, we can generalize that the F-score’s return-based performance is better during good market conditions. The F-score still leads to positive Alpha during bad market conditions, but with a noticeable decline.

Information loss is a drawback of using portfolio-level returns. To augment our performance metrics, precision is used as a complementary performance measure. Unlike return rates, precision is not affected by outliers, and it offers more transparency on the strengths and weaknesses of the F-score in predicting winner and loser stocks. Another noteworthy result is that the F-score is better at predicting ‘loser’ stocks than at predicting ‘winner’ stocks. The overall precision values show that the F-score is not an exceptional classifier, barely reaching 70% precision. The finding that the F-score is ‘not exceptional’ is not a common one in the existing literature. In the world of classifiers, where accuracy metrics are predominantly used, 70% is not an impressive feat. However, this is not a reason to disregard the Piotroski F-score as a stock selection criterion, considering its capability to generate Alpha. Instead, the better strategy is to pair it with other methods in order to establish stronger selection criteria.

## References

[1]  J.P. Morgan Depositary Receipts. Depositary Receipt (DR) Universe. J.P. Morgan Chase & Co., 2024. Retrieved from https://adr.com/dr/drdirectory/drUniverse. Accessed 11 January 2024.

[2]  Piotroski, J. D. Value investing: The Use of Historical Financial Statement Information to Separate Winners from Losers. Journal of Accounting Research, 2000. Retrieved from https://gm10b7le2-mp01-y-https-www-proquest-com.proxy.lirn.net/scholarly-journals/value-investing-use-historical-financial/docview/206723328/se-2. Accessed 10 January 2024.

[3]  Yahoo Inc. Yahoo Terms of Service. Yahoo Legal, 2024. Retrieved from https://legal.yahoo.com/us/en/yahoo/terms/otos/index.html. Accessed 17 February 2024.