Handle outliers in Altman-Z #46

ijyliu · 2024-03-05T22:53:14Z

check sectors - banks etc.

maybe don't winsorize

ijyliu · 2024-03-09T22:14:09Z

@OwenLin2001 to investigate sectors

ijyliu · 2024-03-15T23:42:29Z

are the extreme observations in a particular sector? if so, reconsider winsorizing

OwenLin2001 · 2024-04-01T00:19:40Z

With the new All_Data_with_NLP_Features dataset, the issue is much milder. All Altman Z scores are below 8, and 13 companies score above 6. Those 13 include large companies such as Google and Chevron.

Sector-wise, IT, Health Care, and Energy seem to be the three sectors with high Altman Z scores.

I think no further action is needed on the Altman Z score beyond these observations.

ijyliu · 2024-04-01T00:23:08Z

This issue is in the financial data cleaning file, not all data. Once it's in all data, it has already been winsorized.

ijyliu · 2024-04-01T00:24:53Z

It's in this notebook https://github.com/current12/Stat-222-Project/blob/main/Code%2FData%20Loading%20and%20Cleaning%2FTabular%20Financial%2FCombine%20and%20Clean%20Tabular%20Financial%20Statements%20Data.ipynb

ijyliu · 2024-04-01T00:27:06Z

You will have to find the outliers before they are winsorized and save them. Then join the sector information on. You can also join on the fixed quarter date and company in all data NLP to see which of the outliers are relevant.
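A minimal pandas sketch of the steps above, not the notebook's actual code. The three frames and the column names (`ticker`, `fixed_quarter_date`, `altman_z`, `sector`) are hypothetical stand-ins, and the cutoff of 6 is simply the value used elsewhere in this thread.

```python
import pandas as pd

# Hypothetical stand-ins: the real notebook's column names may differ.
financials = pd.DataFrame({
    "ticker": ["AAA", "BBB", "CCC", "DDD"],
    "fixed_quarter_date": ["2020-01-01"] * 4,
    "altman_z": [12.5, 2.1, 7.3, 0.4],  # values BEFORE winsorizing
})
sectors = pd.DataFrame({
    "ticker": ["AAA", "BBB", "CCC", "DDD"],
    "sector": ["Information Technology", "Energy", "Health Care", "Financials"],
})
all_data_nlp = pd.DataFrame({
    "ticker": ["AAA", "BBB"],
    "fixed_quarter_date": ["2020-01-01", "2020-01-01"],
})

Z_CUTOFF = 6  # assumption: same threshold discussed in this thread


def find_outliers(financials, sectors, all_data_nlp, cutoff=Z_CUTOFF):
    """Flag pre-winsorization outliers, attach sector, mark relevance."""
    outliers = financials[financials["altman_z"] > cutoff]
    # Left join so every outlier is kept even if its sector is missing.
    outliers = outliers.merge(sectors, on="ticker", how="left")
    # indicator=True records whether each row also appears in all data NLP.
    keys = all_data_nlp[["ticker", "fixed_quarter_date"]].drop_duplicates()
    outliers = outliers.merge(
        keys, on=["ticker", "fixed_quarter_date"], how="left", indicator=True
    )
    outliers["in_all_data_nlp"] = outliers["_merge"] == "both"
    return outliers.drop(columns="_merge")


outliers = find_outliers(financials, sectors, all_data_nlp)
print(outliers)
# To save them for later inspection (hypothetical file name):
# outliers.to_csv("altman_z_outliers_pre_winsorization.csv", index=False)
```

Using left joins keeps outliers that never reach all data NLP, so both the relevant and the irrelevant ones can be inspected.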

OwenLin2001 · 2024-04-01T02:08:46Z

Pre-winsorized data exhibits a similar trend.
For Altman Z > 6, the top 4 sectors (after an inner join of the pre-winsorized data with all_data_nlp on ticker) are:

IT - 15
Consumer Discretionary - 7
Health Care - 6
Energy - 6
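For reference, a sketch of how sector counts like these could be produced. The toy frames and column names are hypothetical, and counting unique tickers (rather than company-quarter rows) is an assumption about what the numbers above represent.

```python
import pandas as pd

# Hypothetical toy data; not the project's real schema or values.
pre_winsorized = pd.DataFrame({
    "ticker": ["AAA", "BBB", "CCC", "DDD", "EEE"],
    "altman_z": [9.0, 6.5, 7.2, 3.0, 8.1],
})
all_data_nlp = pd.DataFrame({
    "ticker": ["AAA", "BBB", "CCC", "DDD"],
    "sector": ["Information Technology", "Information Technology",
               "Energy", "Energy"],
})

# Keep only pre-winsorization scores above the cutoff.
high_z = pre_winsorized[pre_winsorized["altman_z"] > 6]

# Inner join on ticker: drops outliers that never appear in all_data_nlp.
joined = high_z.merge(
    all_data_nlp.drop_duplicates("ticker"), on="ticker", how="inner"
)

# Count distinct companies per sector, largest first.
sector_counts = (
    joined.groupby("sector")["ticker"].nunique().sort_values(ascending=False)
)
print(sector_counts.head(4))
```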

Among companies that are outliers in the pre-winsorized data but not in all_data_nlp, there isn't a trend.
After winsorizing, some of the companies keep a high score (e.g. AAPL at 4.32) and some drop (e.g. MUR at 1.20).

What outcome do you expect from inspecting the Altman Z outliers?

ijyliu · 2024-04-01T02:42:00Z

It's a little predictable that some tech companies are scoring very high; they probably have near-zero liabilities. The other sectors are just fairly big sectors.
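To make the liabilities point concrete: assuming the project uses the original Altman (1968) Z-score for public firms (the thread does not say which variant the notebook computes), the score is

```latex
Z = 1.2\,X_1 + 1.4\,X_2 + 3.3\,X_3 + 0.6\,X_4 + 1.0\,X_5,
\qquad
\begin{aligned}
X_1 &= \frac{\text{working capital}}{\text{total assets}}, &
X_2 &= \frac{\text{retained earnings}}{\text{total assets}}, &
X_3 &= \frac{\text{EBIT}}{\text{total assets}}, \\
X_4 &= \frac{\text{market value of equity}}{\text{total liabilities}}, &
X_5 &= \frac{\text{sales}}{\text{total assets}}.
\end{aligned}
```

Only $X_4$ has total liabilities in the denominator, so it grows without bound as liabilities approach zero, while the other four ratios are scaled by total assets.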

I think I'm good with winsorizing as is, even if maybe we should be doing it a little bit less for IT. The process will still maintain fairly high scores for the outlier companies.
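A small sketch of why winsorizing still leaves the outlier companies with high scores: values beyond the chosen percentiles are clipped to the percentile value rather than dropped. The 10th/90th percentile limits and the toy scores here are assumptions for illustration; the limits actually used in the cleaning notebook are not stated in this thread.

```python
import pandas as pd


def winsorize(series, lower=0.05, upper=0.95):
    """Clip values below/above the given quantiles to those quantiles."""
    lo, hi = series.quantile([lower, upper])
    return series.clip(lower=lo, upper=hi)


# Hypothetical Altman Z scores with one extreme value.
scores = pd.Series(
    [0.5, 1.2, 1.8, 2.4, 3.0, 3.3, 4.1, 4.3, 9.7, 25.0],
    index=list("ABCDEFGHIJ"),
)

winsorized = winsorize(scores, lower=0.10, upper=0.90)

# The extreme company is pulled down to the cap but remains the highest.
print(winsorized)
```

The extreme company `J` is reduced to the upper cap but stays at the top of the distribution, which is the behavior described above.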

ijyliu closed this as completed in 577298d Mar 6, 2024

ijyliu self-assigned this Mar 6, 2024

ijyliu reopened this Mar 8, 2024

ijyliu assigned OwenLin2001 and unassigned ijyliu Mar 9, 2024

ijyliu closed this as completed Apr 1, 2024
