TiesdeKok/acctg-579B-python-for-business-research

ACCTG 579B - Python for business PhD students

Creator: Ties de Kok

Edition: Standalone

(last big update: August 2021, last minor update: August 2023)


This is a standalone version of my former ACCTG 579B Ph.D. class.

A few important notes and caveats:

  • You can complete the course at your own pace by following the steps below in chronological order.
  • This was a full-credit class, so completing all the problem sets takes between two and eight weeks, depending on how much time you spend on it each day.
  • The web scraping problem set involves interacting with third-party websites that can change over time. If a website has changed, you can skip the affected parts or complete the spirit of the exercises using a different website.
  • The NLP and ML classes last received a major update in 2021, so they do not discuss the recent developments in that space. I have added a few of my extra materials at the end to cover those gaps.
  • Yes, ChatGPT and GPT-4 can solve many of these problem sets for you as they are reasonably straightforward. However, for real research projects, you often end up in a scenario where ChatGPT/GPT-4 cannot solve your problem and you need to figure it out yourself. You are welcome to use AI coding co-pilots to improve your learning, but by simply copy-pasting things from ChatGPT into Jupyter, you generally don't learn much. So be responsible; the ultimate objective is to learn how to use Python, not to solve the problem sets.

Step 0: download the materials

Click the big green "Code" button (formerly labeled "Clone or download") and click "Download ZIP".

Create a new folder on your computer and extract the ZIP file into that folder.
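If you prefer to do this step from Python rather than with your file manager, the sketch below shows one way to do it with the standard library. The function name `extract_course_zip` and the file and folder names in the usage comment are hypothetical; replace them with the actual name and location of the ZIP file you downloaded.

```python
import zipfile
from pathlib import Path


def extract_course_zip(zip_path, target_dir):
    """Extract the archive at zip_path into target_dir and return the archived file names.

    Hypothetical helper for illustration only; it is not part of the course materials.
    """
    target = Path(target_dir)
    # Create the destination folder (and any missing parent folders) if needed.
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
        return archive.namelist()


# Example usage (assumed file and folder names, adjust to your own download):
# files = extract_course_zip("course-materials.zip", "python_course")
# print(f"Extracted {len(files)} files")
```

Extracting through your operating system works just as well; this is only an alternative.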


Step 1: set up your Python environment

Complete the steps in the following document:

Set up your Python environment
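Once the setup is done, a quick sanity check can confirm that your interpreter finds the packages you expect. The sketch below is only an illustration: the helper `missing_packages` is hypothetical, and the package names in the usage comment are assumptions based on the session titles, not the official list from the setup document.

```python
import importlib.util
import sys


def missing_packages(names):
    """Return the top-level package names from `names` that cannot be found for import."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


# Example usage (assumed package names, adjust to match the setup document):
# print("Python version:", sys.version.split()[0])
# print("Missing packages:", missing_packages(["pandas", "numpy", "requests"]))
```

An empty list means every package you listed can be found by your current environment.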


Session 1: an introduction to Python and Jupyter

Part 1: Watch the following videos in order:

Part 2: Complete session 1 problem set.


Session 2: Data wrangling with Pandas

Part 1: Watch the following videos in order:

Part 2: Complete session 2 problem set.


Session 3: Collecting data from the internet (i.e., web scraping)

Part 1: Watch the following videos in order:

Part 2: Complete session 3 problem set.

Part 3: optional reference recordings


Session 4: Working with textual data (i.e., Natural Language Processing / Textual Analysis)

Part 1: Watch the following videos in order:

Part 2: Complete session 4 problem set.

Part 3: optional reference recordings


Session 5: Supervised machine learning

Part 1: Watch the following videos in order:

Part 2: Complete session 5 problem set.


Session 6: Unsupervised machine learning

Part 1: Watch the following videos in order:

Part 2: Complete session 6 problem set.


Session 7: Using WRDS and Python

Part 1: Watch the following videos in order:

Part 2: No problem set.


Session 8: Best practices

Part 1: Watch the following videos in order:

Part 2: No problem set.


Bonus material - modern NLP and AI

My 2023 EAA session on modern NLP techniques for Accounting research:

My 2023 PyData talk on Generative AI and NLP:

My paper on the application of Generative LLMs for NLP tasks in Accounting research:

My Medium article on using ChatGPT for downstream NLP tasks:


Looking for solutions?

I've uploaded my solutions to this repository as a zip file. It is password-protected so that you don't open it by accident.

You cannot unsee the answer! In your own research projects you have no answer key and have to figure things out yourself; there is no other way. Treat the problem sets the same way until you've given them your all. After that, you can compare your answers with mine, which can be a good learning experience.

Ok, are you sure you want to open the solutions? Alright, the password is below.

Password: RememberNoNeuralyzer!
