In [28]:
import pandas as pd
import numpy as np

The sample used in this study contains firms on Vietnam stock exchange from 2002 to 2019. I exclude the following observations from the sample: **utility and financial firms** , **firms with non-positive total assets or sales**, **firms that are not traded on HNX, UPCOM, or HSX**, _firms with share codes other than 10 and 11_, firms with fewer than 100 daily stock price records during a fiscal year, and firms without sufficient data to calculate the control variables described below. In addition, I follow Love et al. (2007) and remove observations that imply trade credit of longer than 1 year. The final sample consists of 129,177 firm-year observations with 13,712 unique firms.

In [2]:
stockprices = pd.read_csv("StockPrices/CafeF.UPCOM.Upto02.10.2020.csv")\
                .append(pd.read_csv("StockPrices/CafeF.HNX.Upto02.10.2020.csv"))\
                .append(pd.read_csv("StockPrices/CafeF.HNX.Upto02.10.2020.csv"))

In [3]:
stockprices.columns=["TICKER","DATE","OPEN","HIGH","LOW","CLOSE","VOLUME"]
stockprices = stockprices[(stockprices.DATE < 20200101) & (stockprices.DATE > 20011231)]
stockprices.drop(columns=["OPEN","HIGH","LOW"],inplace=True)
stockprices.drop_duplicates(inplace=True)
stockprices.reset_index(inplace=True,drop=True)
stockprices.head()

Unnamed: 0,TICKER,DATE,CLOSE,VOLUME
0,ABC,20191231,9.8,300
1,ABI,20191231,30.6968,10300
2,ABR,20191231,11.7,1501
3,ACV,20191231,75.0,81718
4,ADG,20191231,42.2776,18900


In [4]:
companies = pd.read_csv("Companies.csv")
companies.head()

  has_raised = await self.run_ast_nodes(code_ast.body, cell_name,


Unnamed: 0,TICKER,NAME,PRICE,EXCHANGE,URL
0,A32,Công ty cổ phần 32,39.0,Upcom,/upcom/A32-cong-ty-co-phan-32.chn
1,AAA,Công ty Cổ phần Nhựa An Phát Xanh,12.45,HSX,/hose/AAA-cong-ty-co-phan-nhua-an-phat-xanh.chn
2,AAM,Công ty Cổ phần Thủy sản Mekong,11.7,HSX,/hose/AAM-cong-ty-co-phan-thuy-san-mekong.chn
3,AAV,Công ty Cổ phần Việt Tiên Sơn Địa ốc,8.7,HNX,/hastc/AAV-cong-ty-co-phan-viet-tien-son-dia-o...
4,ABC,Công ty cổ phần Truyền thông VMG,13.9,Upcom,/upcom/ABC-cong-ty-co-phan-truyen-thong-vmg.chn


# Liquidity measures
**The illiquidity measure proposed by Amihud (2002) is used as the primary measure of stock liquidity in this study.** This measure is widely employed in the literature and has been demonstrated to be an appropriate proxy for stock illiquidity. For example, Goyenko et al. (2009) document that among 12 proxies that use daily data, the Amihud illiquidity measure most accurately captures price impact. Hasbrouck (2009) shows that, compared to other daily proxies, the Amihud illiquidity measure is the one most strongly correlated with a TAQ-based price impact coefficient. In addition, Fong et al. (2017) find that the Amihud illiquidity measure is the best daily cost-per-dollar-volume proxy. The Amihud illiquidity measure is calculated as the daily ratio of the absolute value of stock returns to dollar volume, averaged over firm i's fiscal year t:

$$\text{Amihud Illiquidity}_{i,t}=\dfrac{1}{D_{i,t}}\sum_{d=1}^{D}\dfrac{\left |\text{Ret}_{t,d}  \right |}{\text{Dollar Volume}_{i,d}}$$

Where $Ret$ and $\text{Dollar Volume}$ are the return and dollar volume of firm i on day d, respectively, and D is the total number of trading days during firm i's fiscal year t.

Since the distribution of the Amihud illiquidity measure is highly skewed, I follow Edmans et al.'s (2013) approach to modify the Amihud illiquidity measure by taking the natural logarithm of (Amihud illiquidity plus one). In addition, for the convenience of interpreting the empirical results, I multiply the modified Amihud illiquidity measure by −1 and name this measure “LiqAM”. Specifically, $LiqAM$ is defined as $–ln(\text{Amihud illiquidity} + 1)$. A higher value of $LiqAM$ is associated with a higher level of stock liquidity.


In [5]:
filtered_stocks = companies["TICKER"].to_frame().merge(stockprices, on='TICKER',how='inner')

In [6]:
filtered_stocks['YEAR'] = filtered_stocks['DATE']//10000
filtered_stocks.head()

Unnamed: 0,TICKER,DATE,CLOSE,VOLUME,YEAR
0,A32,20191224,27.286,400,2019
1,A32,20191223,26.7994,200,2019
2,A32,20191220,26.7994,500,2019
3,A32,20191217,27.7236,100,2019
4,A32,20191126,25.8753,100,2019


In [9]:
lmstat = filtered_stocks[['TICKER','YEAR']].groupby(by=['TICKER','YEAR'])\
                                .size().reset_index(name='counts')
lmstat.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6227 entries, 0 to 6226
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   TICKER  6227 non-null   object
 1   YEAR    6227 non-null   int64 
 2   counts  6227 non-null   int64 
dtypes: int64(2), object(1)
memory usage: 146.1+ KB


In [10]:
# remove firms with fewer than 100 daily stock price records during a fiscal year
filtered_tickers = lmstat[lmstat.counts >= 100]

In [11]:
filtered_stocks= filtered_stocks.merge(filtered_tickers, on=['TICKER','YEAR'],how='inner')

In [12]:
filtered_stocks['VND_VOLUME'] = filtered_stocks['CLOSE']*filtered_stocks['VOLUME']*1000

In [13]:
filtered_stocks.sort_values(by=['TICKER','DATE'],inplace=True)

In [33]:
lstgr = []
for name,gr in filtered_stocks.groupby(['TICKER','YEAR']):
    print(name,end=';')
    gr['PREV_CLOSE'] = gr['CLOSE'].shift(1,fill_value=gr['CLOSE'].iloc[0])
    gr['RET'] = np.log(gr['CLOSE']/gr['PREV_CLOSE'])
    gr['RET_ON_VOL'] = gr['RET']/gr["VND_VOLUME"]
    lstgr.append(gr)
liquid_measure = pd.concat(lstgr)

('AAV', 2018);('AAV', 2019);('ABC', 2017);('ABC', 2018);('ABC', 2019);('ABR', 2019);('ACE', 2010);('ACE', 2016);('ACE', 2017);('ACE', 2018);('ACM', 2015);('ACM', 2016);('ACM', 2017);('ACM', 2018);('ACM', 2019);('ACV', 2017);('ACV', 2018);('ACV', 2019);('ADP', 2012);('ADP', 2016);('ADP', 2017);('ADP', 2018);('AFX', 2017);('AFX', 2018);('AFX', 2019);('AMC', 2016);('AMC', 2017);('AMC', 2018);('AME', 2010);('AME', 2011);('AME', 2012);('AME', 2013);('AME', 2014);('AME', 2015);('AME', 2019);('AMS', 2017);('AMS', 2018);('AMS', 2019);('AMV', 2010);('AMV', 2011);('AMV', 2012);('AMV', 2013);('AMV', 2014);('AMV', 2015);('AMV', 2016);('AMV', 2017);('AMV', 2018);('AMV', 2019);('ANT', 2017);('APF', 2018);('APF', 2019);('APP', 2011);('APP', 2012);('APP', 2013);('APP', 2014);('APP', 2015);('ASA', 2012);('ASA', 2013);('ASA', 2014);('ASA', 2015);('ASA', 2016);('ASA', 2017);('ASA', 2018);('ATB', 2018);('ATB', 2019);('ATS', 2016);('ATS', 2017);('ATS', 2018);('ATS', 2019);('AVC', 2018);('AVF', 2011);('AVF'

In [36]:
liquid_measure.head()

Unnamed: 0,TICKER,DATE,CLOSE,VOLUME,YEAR,counts,VND_VOLUME,PREV_CLOSE,RET_ON_VOL
383,AAV,20180625,12.3657,362000,2018,134,4476383000.0,12.3657,0.0
382,AAV,20180626,13.5397,800200,2018,134,10834470000.0,12.3657,8.371394e-12
381,AAV,20180627,12.9136,532700,2018,134,6879075000.0,13.5397,-6.88248e-12
380,AAV,20180628,12.9918,210300,2018,134,2732176000.0,12.9136,2.20973e-12
379,AAV,20180629,13.0701,108100,2018,134,1412878000.0,12.9918,4.252873e-12


In [42]:
Amihud_Illiquidity = liquid_measure[['TICKER','YEAR','RET_ON_VOL']].groupby(['TICKER','YEAR']).agg(['sum','count'])

In [46]:
Amihud_Illiquidity[-np.log()]

TICKER  YEAR
AAV     2018   -1.180289e-09
        2019   -1.018378e-09
ABC     2017    1.540031e-07
        2018   -6.842783e-09
        2019    9.612627e-08
                    ...     
YBC     2009    7.563304e-08
        2010    3.114849e-08
        2011   -6.937307e-08
        2012    1.066443e-07
YTC     2018    4.281450e-08
Name: sum, Length: 3495, dtype: float64