# Notebook for creating the figures and data for "US Jobs Streak: Now Second Longest in 86 Years"

This notebook provides the code for replicating the figures and data from the article "[US Jobs Streak: Now Second Longest in 86 Years](https://rickecon.substack.com/p/us-jobs-streak-now-second-longest)" by [Richard W. Evans](https://sites.google.com/site/rickecon) (GitHub: [@rickecon](https://github.com/rickecon), X: [@RickEcon](https://twitter.com/RickEcon), Substack: [Econosseur](https://rickecon.substack.com/)). A GitHub repository for the analyses in this article has been created at https://github.com/OpenSourceEcon/USempl-Streaks-2025-01.

## 0. Preliminaries: Different ways to replicate the analyses

### 0.1. (Easiest, least flexible option) Install the usempl-plots package from PyPI.org

The easiest way to run the analyses in the article "US Jobs Streak: Now Second Longest in 86 Years" is to install the [`usempl-plots`](https://pypi.org/project/usempl-plots/) Python package from PyPI.org. This is the approach we use in this notebook.

One drawback to this approach is that you are limited to the options in the functions and modules in the `usempl-plots` package. But these options are sufficient for the analyses in the article.

The approach described in Section 0.2 is more flexible and is likely the better option for someone who knows how to program in Python and wants to customize the output.

### 0.2. (Hardest, most flexible option) Fork and clone the repository, create the environment, and install the package from your hard drive

This approach allows you to update and customize the functions and modules in the `usempl-plots` package. This is also the best way to contribute fixes and updates back to the `usempl-plots` package repository (https://github.com/OpenSourceEcon/usempl-plots) to improve and expand it (see Section 0.3).

1. Make sure you have Python installed on your computer.
    - Navigate to your terminal and type: `python --version`
    - If you do not have Python, download the free Anaconda distribution of Python from Anaconda.com. Follow the instructions at their download page, https://www.anaconda.com/download. This will require you to give them your email address.
2. Make sure you have Git version control software on your computer.
    - Navigate to your terminal and type: `git version`
    - If nothing comes up, install Git on your computer by following the correct instructions for your operating system at https://git-scm.com/book/en/v2/Getting-Started-Installing-Git.
    - After Git is installed, set up some basic configuration settings such as your name: `git config --global user.name "Your Name"`; your email: `git config --global user.email yourname@example.com`; and an easy editor that you can use from your terminal: `git config --global core.editor vim`.
3. If you don't already have one, sign up for a GitHub account by going to https://github.com/, selecting "Sign Up", and following the instructions.
    - I recommend that you choose a GitHub handle that is relatively short (less than 10 characters) and isn't too wild. This is the handle by which you will be known across all of your open-source interactions. For example, my GitHub handle is [@rickecon](https://github.com/rickecon). Other people use versions of their first and/or last name.
4. Fork the `usempl-plots` GitHub repository, which simply means that you are making a copy of it in the cloud on your personal GitHub account. In your internet browser, go to the URL https://github.com/OpenSourceEcon/usempl-plots and click the "Fork" button in the upper-right area of the screen. In the "Owner*" dropdown on that page, make sure your personal GitHub account is selected. This will make a copy of the repository on your GitHub account in the cloud.
5. Clone your forked repository in the cloud to your local computer. Cloning is simply the Git term for downloading the code in the repository to your local computer in a way that sets it up as a Git repository on your machine, in which the Git software tracks any changes you make.
    - Go to your terminal and navigate to a folder where you want to place a Git repository. Make sure this is not in a Google Drive, Dropbox, or iCloud folder that is copied back and forth from the web.
    - Once you are in that directory on your computer, type the following. Note that you are copying the code from your account's copy (fork) of the repository: `git clone https://github.com/[YourGitHubHandle]/usempl-plots.git`
6. Navigate to the new directory on your local hard drive for this new Git repository: `cd usempl-plots`
7. Create a conda environment for this repository. This is a set of packages and versions that is constant across operating systems and hardware: `conda env create -f environment.yml`
8. Activate the conda environment: `conda activate usempl-plots-dev`
9. Install the `usempl-plots` package from your hard drive: `pip install -e .`

Now you are ready to run the analyses below, and you don't need the step at the beginning of Section 1 that executes the command: `!pip install usempl-plots`.
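Before running the cells below, it can help to confirm that the package is visible to the Python environment you are using. The following is a minimal sketch that uses only the standard library; the helper name `installed_version` is hypothetical and is not part of `usempl-plots`.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None.

    Hypothetical helper for illustration; not part of usempl-plots.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Look up the distribution name used on PyPI
usempl_version = installed_version("usempl-plots")
if usempl_version is None:
    print("usempl-plots is not installed in this environment.")
else:
    print("usempl-plots version:", usempl_version)
```

If the first message prints, check that the conda environment from step 8 is active in the session running the notebook.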

### 0.3. Contributing to the usempl-plots package

If you follow the approach in Section 0.2 and fork the [`usempl-plots` repository](https://github.com/OpenSourceEcon/usempl-plots), you might find errors or inefficiencies in the code. You may also find ways to make the code more useful or to expand the scope of its functionality.

I encourage you to ask any questions about the code or make any suggested changes by either submitting an issue to the GitHub repository (https://github.com/OpenSourceEcon/usempl-plots/issues) or submitting a pull request of code changes (https://github.com/OpenSourceEcon/usempl-plots/pulls).

In [1]:
# Import packages
import pandas as pd
import numpy as np
import os
from usempl_plots import usempl_streaks
from usempl_plots.get_payems import get_payems_data
from usempl_plots.tseries_payems import gen_payems_tseries
from usempl_plots.usempl_npp import usempl_npp
from usempl_plots.usempl_industry import usempl_ind_chg
from bokeh.io import output_notebook

In [2]:
repo_path = (
    "/Users/richardevans/Docs/Economics/OSE/Substack/USempl-Streaks-2025-01"
)
image_dir = os.path.join(repo_path, "images")
data_dir = os.path.join(repo_path, "data")
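The `repo_path` in the cell above is specific to the author's machine, so you will need to change it to the location of your own clone. The following is a hedged sketch of one way to build the `images` and `data` directories under any root folder and create them if they are missing. The helper name `make_output_dirs` is hypothetical, and the temporary directory is used only so that the example runs anywhere.

```python
import os
import tempfile


def make_output_dirs(repo_path):
    """Create (if needed) and return the images and data directories.

    Hypothetical helper for illustration; not part of usempl-plots.
    """
    image_dir = os.path.join(repo_path, "images")
    data_dir = os.path.join(repo_path, "data")
    for directory in (image_dir, data_dir):
        # exist_ok=True makes the call safe to repeat
        os.makedirs(directory, exist_ok=True)
    return image_dir, data_dir


# Demonstration in a throwaway folder; replace with your clone's path
demo_root = tempfile.mkdtemp()
demo_image_dir, demo_data_dir = make_output_dirs(demo_root)
print(demo_image_dir)
print(demo_data_dir)
```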

## 1. Create statistics in paragraphs before Figure 1

In [3]:
beg_date_str = "1919-01-01"
end_date_str = "2025-01-17"
usempl_df, beg_date_str2, end_date_str2 = get_payems_data(
    beg_date_str=beg_date_str,
    end_date_str=end_date_str,
    file_path=None,
)
usempl_df

Beginning date of U.S. employment series is 1919-07-01
End date of U.S. employment series is 2024-12-01


| | Date | PAYEMS | BLS_annual | diff_monthly | diff_yoy | Source |
|---|---|---|---|---|---|---|
| 0 | 1919-07-01 | 2.707800e+07 | 27078000.0 | | | Cubic spline interp. of BLS annual data |
| 1 | 1919-08-01 | 2.747546e+07 | | 397457.629359 | | Cubic spline interp. of BLS annual data |
| 2 | 1919-09-01 | 2.778468e+07 | | 309221.098029 | | Cubic spline interp. of BLS annual data |
| 3 | 1919-10-01 | 2.801165e+07 | | 226966.928133 | | Cubic spline interp. of BLS annual data |
| 4 | 1919-11-01 | 2.816234e+07 | | 150695.119670 | | Cubic spline interp. of BLS annual data |
| ... | ... | ... | ... | ... | ... | ... |
| 1261 | 2024-08-01 | 1.587700e+08 | | 78000.000000 | 2349000.0 | BLS monthly series |
| 1262 | 2024-09-01 | 1.590250e+08 | | 255000.000000 | 2358000.0 | BLS monthly series |
| 1263 | 2024-10-01 | 1.590680e+08 | | 43000.000000 | 2236000.0 | BLS monthly series |
| 1264 | 2024-11-01 | 1.592800e+08 | | 212000.000000 | 2266000.0 | BLS monthly series |


In [4]:
# Print monthly job gain "diff_monthly" for "Date" = "2024-12-01"
print(
    "Monthly job gain for Dec. 2024 =",
    "{:,}".format(
        int(usempl_df[usempl_df["Date"] == "2024-12-01"][
            "diff_monthly"
        ].values[0])
    )
)
print("")
print(
    "Average monthly job gains for last 2 years (Jan. 2023 to Dec. 2024) =",
    "{:,}".format(
        round(usempl_df[usempl_df["Date"] >= "2023-01-01"][
            "diff_monthly"
        ].mean())
    )
)
print("")
print(
    "Total workers in Dec. 2024 =",
    "{:,}".format(
        int(usempl_df[usempl_df["Date"] == "2024-12-01"]["PAYEMS"].values[0])
    )
)
print("")
print(
    "Total workers in Jan. 1980 =",
    "{:,}".format(
        int(usempl_df[usempl_df["Date"] == "1980-01-01"]["PAYEMS"].values[0])
    )
)
print("")
print(
    "Total workers in Jan. 1939 =",
    "{:,}".format(
        int(usempl_df[usempl_df["Date"] == "1939-01-01"]["PAYEMS"].values[0])
    )
)
print("")
# Generate variable series for monthly percent change in employment
usempl_df["pctchg_monthly"] = usempl_df["PAYEMS"].pct_change() * 100
print(
    "Average monthly percent employment growth (Jul. 1919 to Dec. 2024) =",
    str(usempl_df["pctchg_monthly"].mean().round(2)) + "%"
)
print("")
usempl_df["pctchg_annual"] = usempl_df["PAYEMS"].pct_change(periods=12) * 100
print(
    "Average annual percent employment growth (Jul. 1919 to Dec. 2024) =",
    str(usempl_df["pctchg_annual"].mean().round(2)) + "%"
)
print("")
print(
    "Predicted monthly employment growth for Dec. 2024 based on 0.14% average =",
    "{:,}".format(
        round(
            usempl_df[usempl_df["Date"] == "2024-11-01"]["PAYEMS"].values[0] *
            usempl_df["pctchg_monthly"].mean() / 100
        )
    )
)

Monthly job gain for Dec. 2024 = 256,000

Average monthly job gains for last 2 years (Jan. 2023 to Dec. 2024) = 218,542

Total workers in Dec. 2024 = 159,536,000

Total workers in Jan. 1980 = 90,800,000

Total workers in Jan. 1939 = 29,923,000

Average monthly percent employment growth (Jul. 1919 to Dec. 2024) = 0.14%

Average annual percent employment growth (Jul. 1919 to Dec. 2024) = 1.75%

Predicted monthly employment growth for Dec. 2024 based on 0.14% average = 226,374
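Note that the predicted gain of 226,374 is computed from the unrounded sample mean of `pctchg_monthly`, not from the rounded 0.14% shown in the label. The following rough check uses only numbers printed in this notebook: the Nov. 2024 PAYEMS level of 159,280,000 from the table output and the 226,374 prediction. The variable names are illustrative and are not part of the notebook's code.

```python
# Nov. 2024 employment level, taken from the table output above
nov_2024_payems = 159_280_000

# Prediction if the average growth rate is first rounded to 0.14%
rounded_avg_pct = 0.14
predicted_rounded = round(nov_2024_payems * rounded_avg_pct / 100)

# Average growth rate implied by the notebook's printed prediction
printed_prediction = 226_374
implied_avg_pct = printed_prediction / nov_2024_payems * 100

print("Prediction with rounded 0.14% =", "{:,}".format(predicted_rounded))
print("Implied unrounded average =", str(round(implied_avg_pct, 4)) + "%")
```

The rounded rate gives 222,992, so the gap of a few thousand jobs relative to 226,374 comes entirely from rounding the average (about 0.1421%) to two decimal places.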


## 2. Create Figure 1 (Time series of US total nonfarm employment)

In [5]:
output_notebook()
# fig1_title_str = None
fig1_title_str = (
    "Figure 1. US Total Monthly Nonfarm Payroll Employment (PAYEMS), 1919-2024"
)
fig1, beg_date_str, end_date_str2 = gen_payems_tseries(
    end_date=end_date_str,
    fig_title_str=fig1_title_str,
    save_plot=image_dir
)

Beginning date of U.S. employment series is 1919-07-01
End date of U.S. employment series is 2024-12-01
PAYEMS data downloaded on 2025-01-17 and has most recent PAYEMS data month of 2024-12-01.


## 3. Create Figure 2 (US employment streaks), Figure 3 (US employment streaks scatterplot), and Table 1 (Top 7 streaks)

Executing the following cell displays Figures 2 and 3. The data underlying Table 1 also appear in the cell output, where the `usempl_streaks()` function prints a table of all streaks.

In [19]:
output_notebook()
beg_date_strk_str = "1939-01-01"
end_date_strk_str = "2025-01-17"
# strk_line_title_lst = None
strk_line_title_lst = [
    ("Figure 2. US employment streaks: consecutive positive monthly"),
    ("gains and cumulative employment gains, 1939-2024")
]
# strk_scat_title_lst = None
strk_scat_title_lst = [
    ("Figure 3. US employment streaks: consecutive positive monthly"),
    ("gains and average monthly employment gains, 1939-2024")
]
fig_lst, beg_date_str, end_date_str, strk_table_df = usempl_streaks(
    start_date=beg_date_strk_str,
    end_date=end_date_strk_str,
    fig_line_title_lst=strk_line_title_lst,
    fig_scat_title_lst=strk_scat_title_lst,
    indicate_recent_line=(
        "Current streak (48 months): \n 2021-01 to 2024-12",
        90, 270, 36, 18_900_000, 46, 17_400_000
    ),
    indicate_recent_scat=(
        "Current streak (48 months): \n 2021-01 to 2024-12",
        120, 150, 40, 670_000, 46.5, 430_000
    ),
    save_plot=image_dir
)

Beginning date of U.S. employment series is 1939-01-01
End date of U.S. employment series is 2024-12-01
PAYEMS data downloaded on 2025-01-17 and has most recent PAYEMS data month of 2024-12-01.
Number of streaks: 83

Print a table of all streaks.

    strk_num start_date end_date  months_in_streak  total_emp_gain  \
0          1    1939-02  1939-03                 2        357000.0   
1          2    1939-05  1939-06                 2        408000.0   
2          3    1939-08  1940-03                 8       1406000.0   
3          4    1940-05  1940-06                 2        276000.0   
4          5    1940-08  1943-04                33      10705000.0   
..       ...        ...      ...               ...             ...   
78        79    2010-03  2010-05                 3        943000.0   
79        80    2010-08  2010-08                 1          3000.0   
80        81    2010-10  2020-02               113      21969000.0   
81        82    2020-05  2020-11                 7  
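For intuition about how a streak table like the one above can be built, the following is a minimal sketch on toy data. It is not the implementation used inside `usempl_streaks()`, which this notebook does not show. The function name `find_streaks` and the toy series are hypothetical; only the output column names are borrowed from the table above. A streak is treated here as a maximal run of consecutive months with a strictly positive monthly change.

```python
import pandas as pd


def find_streaks(diff_monthly):
    """Summarize runs of consecutive positive values in a monthly series.

    Hypothetical helper for illustration; not part of usempl-plots.
    """
    positive = diff_monthly > 0
    # Start a new run id every time the positive/non-positive flag flips
    run_id = positive.astype(int).diff().ne(0).cumsum()
    rows = []
    # Keep only positive months, then group them by their run id
    for _, run in diff_monthly[positive].groupby(run_id[positive]):
        rows.append(
            {
                "start_date": str(run.index[0]),
                "end_date": str(run.index[-1]),
                "months_in_streak": len(run),
                "total_emp_gain": run.sum(),
                "avg_monthly_emp_gain": run.mean(),
            }
        )
    return pd.DataFrame(rows)


# Toy monthly job changes (made-up numbers, not PAYEMS data)
toy_diff = pd.Series(
    [100, 200, -50, 300, 150, 50, -10, 20],
    index=pd.period_range("2000-01", periods=8, freq="M"),
)
streaks = find_streaks(toy_diff)
print(streaks)
```

In this toy series the two negative months split the data into three streaks, and a single positive month counts as a streak of length one, which matches the one-month streak (2010-08) in the output above.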

In [22]:
# Make Table 1 csv of 7 largest streaks
strk_table_top7_df = strk_table_df[
    (
        (strk_table_df["months_in_streak"] >= 40)
        | (strk_table_df["total_emp_gain"] >= 10_000_000)
    )
][
    [
        "start_date",
        "end_date",
        "months_in_streak",
        "total_emp_gain",
        "avg_monthly_emp_gain",
    ]
].sort_values(
    by=["months_in_streak", "avg_monthly_emp_gain"],
    ascending=[False, False]
)
# Change column names in strk_table_top7_df
strk_table_top7_df.rename(columns={
    "start_date": "Start date",
    "end_date": "End date",
    "months_in_streak": "Months in streak",
    "total_emp_gain": "Total employment gain",
    "avg_monthly_emp_gain": "Average monthly employment gain"
}, inplace=True)
strk_table_top7_df.to_csv(
    os.path.join(data_dir, "table1_top7streaks.csv"), index=False
)
strk_table_top7_df

| | Start date | End date | Months in streak | Total employment gain | Average monthly employment gain |
|---|---|---|---|---|---|
| 80 | 2010-10 | 2020-02 | 113 | 21969000.0 | 194415.9 |
| 82 | 2021-01 | 2024-12 | 48 | 17018000.0 | 354541.7 |
| 59 | 1986-07 | 1990-06 | 48 | 10702000.0 | 222958.3 |
| 76 | 2003-09 | 2007-06 | 46 | 7919000.0 | 172152.2 |
| 53 | 1975-07 | 1979-03 | 45 | 12958000.0 | 287955.6 |
| 4 | 1940-08 | 1943-04 | 33 | 10705000.0 | 324393.9 |
| 81 | 2020-05 | 2020-11 | 7 | 12340000.0 | 1762857.0 |
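As a consistency check on Table 1, the average monthly gain in each row should equal the total gain divided by the number of months in the streak. The sketch below re-enters the table values by hand (the list name `table1_rows` is illustrative) and confirms the relationship up to the rounding shown in the output.

```python
# (start, end, months, total gain, average monthly gain) from Table 1
table1_rows = [
    ("2010-10", "2020-02", 113, 21_969_000, 194_415.9),
    ("2021-01", "2024-12", 48, 17_018_000, 354_541.7),
    ("1986-07", "1990-06", 48, 10_702_000, 222_958.3),
    ("2003-09", "2007-06", 46, 7_919_000, 172_152.2),
    ("1975-07", "1979-03", 45, 12_958_000, 287_955.6),
    ("1940-08", "1943-04", 33, 10_705_000, 324_393.9),
    ("2020-05", "2020-11", 7, 12_340_000, 1_762_857.0),
]

# Largest gap between total / months and the reported average
max_gap = max(
    abs(total / months - avg) for _, _, months, total, avg in table1_rows
)
print("Largest difference from reported average =", round(max_gap, 2))

# Rows that qualify only through the 10 million total-gain condition
short_but_large = [
    (start, end) for start, end, months, total, _ in table1_rows
    if months < 40 and total >= 10_000_000
]
print("Included via total gain only:", short_but_large)
```

The second list shows why Table 1 has seven rows: the 1940-1943 and 2020 streaks are shorter than 40 months but enter through the `total_emp_gain >= 10_000_000` condition in the filter.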
