may you share what book you used for First-Difference Estimator #1
Comments
Vitor Kamada wrote (Thu, May 12, 2022, 7:47 PM):

Angrist, J. D. and Pischke, J.-S. (2014). Mastering ’Metrics: The Path from Cause to Effect. Princeton University Press.
Wooldridge, J. M. (2016). Introductory Econometrics: A Modern Approach, 6th Edition. Cengage Learning.
Kamada, V. (2020b). Causal Inference with Python. https://causal-methods.github.io/Book
Heiss, F. and Brunner, D. Using Python for Introductory Econometrics.
Angrist, J. D. and Pischke, J.-S. (2009). Mostly Harmless Econometrics: An Empiricist's Companion. Princeton University Press.
Wooldridge, J. (2010). Econometric Analysis of Cross Section and Panel Data, 2nd ed. Cambridge: MIT Press.

Thread author wrote:

Great, thanks.
It would be very kind of you to share some simple books on panel data,
or videos that are simple to understand,
in addition to your videos.
Vitor Kamada wrote (Fri, May 13, 2022, 1:37 PM):

These are the simplest and best books for beginners. They have great chapters on Panel Data (fixed effects, first difference, etc.):
Real Econometrics: The Right Tools to Answer Important Questions, by Michael Bailey (Jan 3, 2019)
Introduction to Econometrics (3rd Edition), by James Stock and Mark W. Watson (Jan 1, 2017)

Thread author wrote:

Great, thanks.
Can I find solutions there for machine-learning-style classification problems?
Let's say a matrix with 1000000 rows and 2000 features,
20 observations of the same id at different times,
and a yes/no target.
Vitor Kamada wrote (Fri, May 13, 2022, 3:44 PM):

Econometrics textbooks do not cover Machine Learning. Econometrics focuses on causal inference, not forecasting. The exception is Time Series Econometrics.
If you want to see examples and solutions for a problem like yours, study the book An Introduction to Statistical Learning:
https://www.statlearning.com/
It is easy to find Python code for the book's examples on the Internet.

Thread author wrote:

Yes, exactly.
Panel data is time series data.
So how should prediction be done for panel data when we have 200 features?
Vitor Kamada wrote (Fri, May 13, 2022, 4:08 PM):

Panel data has a time dimension. But the Econometrics of Panel Data doesn't traditionally deal with this type of problem: prediction with 200 features. You are better off using Machine Learning textbooks. The combination of Panel Data techniques and Machine Learning methods is covered only in highly technical papers. There is no simple book for beginners. You can study the two techniques separately, using different books.

Thread author wrote:

"Panel Data techniques and Machine Learning methods are covered only in highly technical papers"
I would try them. Could you share some?
These seem not so bad:
https://towardsdatascience.com/assigning-panel-data-to-training-testing-and-validation-groups-for-machine-learning-models-7017350ab86e
https://towardsdatascience.com/a-guide-to-panel-data-regression-theoretics-and-implementation-with-python-4c84c5055cf8
though these are more complicated:
Synth (R): https://cran.r-project.org/web/packages/Synth/Synth.pdf
susanathey/MCPanel (R code): https://github.com/susanathey/MCPanel
synth-inference/synthdid (R code): https://github.com/synth-inference/synthdid
ebenmichael/augsynth (R code): https://github.com/ebenmichael/augsynth
But since there is code, it is possible to learn ...
Vitor Kamada wrote (Sat, May 14, 2022, 12:03 AM):

"Panel data" has two meanings. (1) It may refer to the data structure, that is, the same entities observed across time. (2) It may refer to panel methods: Econometrics estimators for causal inference, such as fixed effects, first difference, DID, etc.
The article "Assigning Panel Data to Training, Testing and Validation Groups for Machine Learning Models" is about (1): panel data forecasting using Machine Learning methods. It is what you learn from Machine Learning textbooks.
The article "A Guide to Panel Data Regression: Theoretics and Implementation with Python" is about (2). It is what you learn from Econometrics textbooks.
If your goal is forecasting, go for Deep Learning (Neural Networks). If you want to establish causality, study econometrics. There is no reason to run a marathon in ballet pointe shoes or to dance ballet in running shoes.
If you can read the papers of Susan Athey and implement her methods, that is excellent. She has been developing methods at the intersection of Causal Inference and Machine Learning. She and her coauthors use Machine Learning methods to enhance Causal Inference methods. Fundamentally, they are attacking Causal Inference questions.

Thread author wrote:

"The article "Assigning Panel Data to Training, Testing and Validation Groups for Machine Learning Models" is about (1) panel data forecasting using Machine Learning Methods. It is what you learn using Machine Learning textbooks."
Super, thanks for sharing.
That is what I was asking about.
Could you share some GitHub code for exactly this kind of solution for panel data (many timed measurements for the same samples)?
Or books or papers ...
My guess is that it may be practical, for example, for equipment failure prediction, like
https://medium.com/swlh/machine-learning-for-equipment-failure-prediction-and-predictive-maintenance-pm-e72b1ce42da1
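The idea of assigning panel rows to training and testing groups, meaning (1) above, can be sketched in a few lines. This is only an illustrative sketch, not code from the linked article: the function name `split_by_unit`, the toy ids, and the 20% test share are all assumptions made here. The point it tries to show is that whole units, rather than single rows, are assigned to one side of the split, so the same id never appears in both training and testing.

```python
import random

def split_by_unit(unit_ids, test_share=0.2, seed=0):
    """Assign whole units (not single rows) to train or test.

    Returns two lists of row indices. Every row of a given unit lands
    on the same side, so no unit is seen in both training and testing.
    (Hypothetical helper written for this illustration.)
    """
    units = sorted(set(unit_ids))
    rng = random.Random(seed)
    rng.shuffle(units)
    n_test = max(1, round(len(units) * test_share))
    test_units = set(units[:n_test])
    train_rows = [i for i, u in enumerate(unit_ids) if u not in test_units]
    test_rows = [i for i, u in enumerate(unit_ids) if u in test_units]
    return train_rows, test_rows

# Toy panel in long format: 10 units, 20 time periods each.
unit_ids = [unit for unit in range(10) for _ in range(20)]
train_rows, test_rows = split_by_unit(unit_ids, test_share=0.2, seed=0)

train_units = {unit_ids[i] for i in train_rows}
test_units = {unit_ids[i] for i in test_rows}
print(len(train_rows), len(test_rows))
print(train_units.isdisjoint(test_units))
```

With real data, the row indices returned here would be used to slice the feature matrix and the target before fitting any classifier.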
Vitor Kamada wrote (Sun, May 15, 2022, 12:54 PM):

First, I would ignore the Panel Data structure and deploy a Neural Network using Keras. The best book is Deep Learning with Python, Second Edition, by Francois Chollet (Dec 21, 2021).
Another decent approach is xgboost. Book: Hands-On Gradient Boosting with XGBoost and scikit-learn: Perform accessible machine learning and extreme gradient boosting with Python, by Corey Wade and Kevin Glynn.
If the results are unsatisfactory and/or you want to go deeper, try to integrate the Panel Data structure.
Paper: Interpretable Neural Networks for Panel Data Analysis in Economics, by Yucheng Yang, Zhong Zheng, and Weinan E.
How to process panel data for use in a recurrent neural network (RNN):
https://stackoverflow.com/questions/40008240/how-to-process-panel-data-for-use-in-a-recurrent-neural-network-rnn

Thread author wrote:

Great, thanks.
However, the book
Hands-On Gradient Boosting with XGBoost and scikit-learn: Perform accessible machine learning and extreme gradient boosting with Python, by Corey Wade and Kevin Glynn,
has no material related to panel data (tabular data with many rows for the same samples at different times).
Do you mean using
https://stackoverflow.com/questions/40008240/how-to-process-panel-data-for-use-in-a-recurrent-neural-network-rnn
to convert the data to tabular form and then using any Python package for tabular data?
PS: Deep learning is data greedy, so is it not practical here? As stated in
https://github.com/Amplo-GmbH/AutoML
"When log files have to be classified, and there is not enough data for time series methods (such as LSTMs, ROCKET or Weasel, Boss, etc), one needs to fall back to classical machine learning models which work better with lower samples."
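The conversion asked about above can be sketched roughly as follows. This is not taken from the Stack Overflow answer; the helper names `long_to_sequences` and `sequences_to_wide` and the toy rows are hypothetical, and the sketch assumes a balanced panel (every unit observed in every period). It shows two layouts built from the same long-format rows: a nested (units, periods, features) structure, which is the kind of shape recurrent layers usually expect, and a wide table with one row per unit, which tabular learners such as gradient boosting can take.

```python
from collections import defaultdict

def long_to_sequences(rows):
    """Group long-format rows (unit_id, period, features) by unit.

    Returns (unit_ids, sequences), where sequences[i][t] is the feature
    list of unit unit_ids[i] at its t-th observed period.
    (Hypothetical helper; assumes a balanced panel.)
    """
    by_unit = defaultdict(list)
    for unit_id, period, features in rows:
        by_unit[unit_id].append((period, list(features)))
    unit_ids = sorted(by_unit)
    sequences = [
        [feats for _, feats in sorted(by_unit[u], key=lambda pair: pair[0])]
        for u in unit_ids
    ]
    return unit_ids, sequences

def sequences_to_wide(sequences):
    """Flatten each unit's sequence into one wide row (period-major order)."""
    return [[value for step in seq for value in step] for seq in sequences]

# Toy long-format panel: 3 units, 4 periods, 2 features, rows out of order.
rows = [
    (unit, period, (unit * 10 + period, unit * 100 + period))
    for period in reversed(range(4))
    for unit in range(3)
]
unit_ids, sequences = long_to_sequences(rows)
wide = sequences_to_wide(sequences)

# Dimensions: units x periods x features, then units x (periods * features).
print(len(sequences), len(sequences[0]), len(sequences[0][0]))
print(len(wide), len(wide[0]))
```

The wide layout discards nothing, but the number of columns grows with the number of periods, which matters with hundreds of features per period.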
Vitor Kamada wrote:

Earlier you mentioned a matrix with 1000000 rows. This is more than enough for Deep Learning.

Panel Data estimators use the information that we observe the same unit at different points in time. Let's say that we observe the revenue of Microsoft over several years. The observations (rows) for Microsoft are likely to be dependent because, at the end of the day, they are observations of the same company. This information is useful for mitigating bias, that is, for dealing with endogeneity problems. It is unlikely to improve forecasting accuracy. A Machine Learning algorithm is designed to maximize forecasting accuracy.

Panel Data is not the typical data structure of most Machine Learning problems. Panel Data estimators actually transform the data (time demeaning, first difference, etc.). These transformations are not useful for forecasting.

Each Machine Learning algorithm needs the data in a "certain way". Whatever that way is, it is your job to make the modifications. Even for Panel Data estimators, you have to set (declare) the time and unit-of-analysis variables. In this case, you would have two columns as indices. Usually, you cannot use this data format for Machine Learning algorithms.

If you have a small sample size, use whichever Machine Learning algorithm is most appropriate.
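The two transformations named above, time demeaning and first differencing, can be illustrated on simulated data. This is a rough sketch under assumptions chosen here, not an example from any of the books: the data-generating process, the sample sizes, and the true coefficient `beta = 2.0` are all hypothetical, and there is a single regressor so that the OLS slope can be computed by hand. The regressor is built to be correlated with an unobserved unit effect, which is the situation in which the transformations help with bias.

```python
import random

def slope(x, y):
    """OLS slope of y on x (with intercept), for a single regressor."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(0)
n_units, n_periods, beta = 500, 5, 2.0  # illustrative values only

# Simulated panel: y_it = beta * x_it + alpha_i + u_it, where x_it is
# correlated with the unobserved, time-invariant unit effect alpha_i.
panel = []  # one (x_series, y_series) pair per unit
for _ in range(n_units):
    alpha = rng.gauss(0, 1)
    xs = [alpha + rng.gauss(0, 1) for _ in range(n_periods)]
    ys = [beta * x + alpha + rng.gauss(0, 1) for x in xs]
    panel.append((xs, ys))

# No transformation: stack all rows and ignore which unit they belong to.
x_pool = [x for xs, _ in panel for x in xs]
y_pool = [y for _, ys in panel for y in ys]
b_pooled = slope(x_pool, y_pool)

# Time demeaning: subtract each unit's own mean from its observations.
x_dm, y_dm = [], []
for xs, ys in panel:
    mx, my = sum(xs) / n_periods, sum(ys) / n_periods
    x_dm += [x - mx for x in xs]
    y_dm += [y - my for y in ys]
b_within = slope(x_dm, y_dm)

# First difference: subtract the previous period within each unit.
x_fd, y_fd = [], []
for xs, ys in panel:
    x_fd += [xs[t] - xs[t - 1] for t in range(1, n_periods)]
    y_fd += [ys[t] - ys[t - 1] for t in range(1, n_periods)]
b_fd = slope(x_fd, y_fd)

print(f"pooled={b_pooled:.2f} within={b_within:.2f} fd={b_fd:.2f}")
```

Under this particular simulation, both transformed regressions should land close to the true coefficient, while the untransformed regression is pulled away from it because the unit effect is left in the error term.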
Vitor Kamada wrote (Sun, May 15, 2022, 6:37 PM):

Even with Panel Data, you can run regular OLS that ignores the Panel Data structure. In this case, each observation of Microsoft is treated as independent. Obviously, the results are different. Regular OLS is expected to be biased. Roughly speaking, a Machine Learning algorithm does the same as regular OLS.

Thread author wrote:

Yes, that is exactly what I am trying to find:
a project with code for panel data that builds an ML model for classification/regression,
to learn by example how to deal with panel data.
I am very surprised that it is so difficult to find this kind of GitHub repository.
I mean one that does not do this:
"Even with Panel Data, you can run regular OLS that ignores the Panel Data structure. In this case, each observation of Microsoft is treated as independent. Obviously, the results are different. Regular OLS is expected to be biased. Roughly speaking, a Machine Learning algorithm does the same as regular OLS."
but gives a proper ML solution.
Great material, thanks.
Could you share which book you used for the First-Difference Estimator?
https://www.youtube.com/watch?v=p9NhSrTugYM&list=PLOQU3c_3DSpLTBa0vqPFVwDCqXlXiu49j&index=55
And also for all the DiD videos:
14.2) Algebra of Difference-in-Differences (DID)
14.3) Python: Diff-in-Diff (DD)
14.4) Quasi-Experiment Diff-in-Diff (DID)
What other material may be helpful for understanding DiD
and alpha_i?
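On the alpha_i question, a short derivation may help. It uses the standard unobserved-effects notation, which is an assumption here and may differ from the notation in the videos: $y_{it}$ is the outcome for unit $i$ in period $t$, $x_{it}$ a regressor, $\alpha_i$ a time-invariant unit effect, and $u_{it}$ the idiosyncratic error.

```latex
\begin{align}
y_{it}        &= \beta x_{it} + \alpha_i + u_{it}, \\
y_{i,t-1}     &= \beta x_{i,t-1} + \alpha_i + u_{i,t-1}, \\
\Delta y_{it} &= y_{it} - y_{i,t-1} = \beta\,\Delta x_{it} + \Delta u_{it}.
\end{align}
```

Because $\alpha_i$ does not vary over time, it cancels when the second line is subtracted from the first, so the first-difference regression can estimate $\beta$ even when $\alpha_i$ is correlated with $x_{it}$.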