# Employment data

We examine U.S. employment data, especially 
**Nonfarm Payroll** (NFP), which has considerable influence 
in financial markets. By constructing models, 
we derive error bounds for forecasting NFP.

*Dependencies:*

    - Linux, bash [not critical, generally cross-platform]
    - Python: matplotlib, pandas [recommend Anaconda distribution]
    - Modules: yi_1tools, yi_plot, yi_timeseries, yi_fred
     
*CHANGE LOG*

    2015-11-12  Preliminary draft.

In [None]:
#  NOTEBOOK v4 settings and system details:      [00-tpl v4.15.0812]

#  Assume that the backend is LINUX (e.g. Ubuntu running bash shell):
print '\n ::  TIMESTAMP of last notebook execution:'
!date
print ' ::  IPython version:'
!ipython --version

#  Automatically RELOAD modified modules:
%load_ext autoreload
%autoreload 2
#           0 disables autoreload.

#  DISPLAY options
from IPython.display import Image 
#  e.g. Image(filename='holt-winters-equations.png', embed=True) # url= also works
from IPython.display import YouTubeVideo
#  e.g. YouTubeVideo('1j_HxD4iLn8', start='43', width=600, height=400)
from IPython.display import HTML # useful for snippets
#  e.g. HTML('<iframe src=http://en.mobile.wikipedia.org/?useformat=mobile width=700 height=350></iframe>')
from IPython.core import page
get_ipython().set_hook('show_in_pager', page.as_hook(page.display_page), 0)
#  Or equivalently in config file: "InteractiveShell.display_page = True", 
#  which will display results in secondary notebook pager frame in a cell.

#  MATH display, use %%latex, rather than the following:
#                from IPython.display import Math
#                from IPython.display import Latex
#  Generate PLOTS inside notebook:
%matplotlib inline

import pandas as pd
print ' ::  pandas version:'
print pd.__version__
#      Disable the HTML representation so that pandas DataFrames print as plain text:
#      [Deprecated: pd.core.format.set_printoptions( notebook_repr_html=True ) ]
pd.set_option( 'display.notebook_repr_html', False )

print ' ::  Working directory (set as $workd):'
workd, = !pwd
print workd + '\n'

In [None]:
from fecon import *

## Notes on NFP

FRED states: "Total **Nonfarm Payroll** is a measure of the number of 
U.S. workers in the economy that *excludes proprietors, private household 
employees, unpaid volunteers, farm employees, and the unincorporated self-employed*. 
This measure accounts for approximately 80 percent of the workers who 
contribute to Gross Domestic Product (GDP).

This measure provides useful insights into the current economic situation 
because it can represent the number of jobs added or lost in an economy. 
Increases in employment might indicate that businesses are hiring which 
might also suggest that businesses are growing. Additionally, those who 
are newly employed have increased their personal incomes, which 
means (all else constant) their disposable incomes have also increased, 
thus fostering further economic expansion.

Generally, the U.S. labor force and levels of employment and unemployment 
are subject to fluctuations due to seasonal changes in weather, major holidays, 
and the opening and closing of schools. The Bureau of Labor Statistics (BLS) 
adjusts the data to offset the seasonal effects to show non-seasonal changes: 
for example, women's participation in the labor force; or a general decline 
in the number of employees, a possible indication of a downturn in the economy. 

To closely examine seasonal and non-seasonal changes, the BLS releases 
two **monthly** statistical measures: the seasonally adjusted All Employees: 
Total Nonfarm (PAYEMS) and All Employees: Total Nonfarm (PAYNSA), 
which is not seasonally adjusted."

**The market keenly watches PAYEMS, thus we set *m4nfp* to that measure.**

In [None]:
#  Nonfarm Payroll workers in thousands, seasonally adjusted:
nfp = get( m4nfp )

In [None]:
#  We know the number of total workers, 
#  so test the assertion that NFP
#  "accounts for approximately 80 percent of the workers."
workers = get( m4workers )
nfp_ratio = todf( nfp / workers )

In [None]:
plot( nfp_ratio )

So NFP *currently* accounts for **approximately 75% 
of all workers**, slightly below the 80 percent quoted above. 
The nfp_ratio plot clearly shows the shift away from farms and into 
corporations since WWII.

In [None]:
plot( nfp )

In [None]:
#  NFP clearly has a linear trend over the long-term.
nfp_trend = trend( nfp )

In [None]:
#  Set slope for later calculation:
nfp_slope = 128.98
#  as of 2015-11-12: 128.98

The slope of the trend gives us a **baseline expectation: 
on average 129,000 nonfarm workers should be added monthly** 
(given data from 1939 to 2015). At the core, 
this involves some assumptions about the US birthrate.
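
The slope comes from fitting a straight line to the monthly series. As a rough illustration, assuming that `trend` performs an ordinary least-squares fit against a simple time index (an assumption, not confirmed here), the slope could be sketched in plain Python. The function name `ols_slope` and the toy data are hypothetical.

```python
def ols_slope(y):
    """Least-squares slope of y regressed on the time index 0, 1, 2, ..."""
    n = len(y)
    xbar = (n - 1) / 2.0                  # mean of the time index
    ybar = sum(y) / float(n)              # mean of the observations
    num = sum((x - xbar) * (v - ybar) for x, v in enumerate(y))
    den = sum((x - xbar) ** 2 for x in range(n))
    return num / den


# Toy series (hypothetical): starts at 5 and grows by exactly 2 per period.
toy = [5 + 2 * t for t in range(6)]
toy_slope = ols_slope(toy)
```

Applied to monthly NFP levels in thousands, the returned slope would be in thousands of workers per month, the same units as `nfp_slope`.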

In [None]:
#  Take the first difference, i.e. Month over Month:
nfp_dif = dif( nfp, 1 )

In [None]:
#  Subtract the long-term base expected change:
nfp_dif_base = todf( nfp_dif - nfp_slope )

In [None]:
#  Our data goes back to 1939, 
#  so let's plot a recent subsegment:
plot( nfp_dif_base['2000':] )

During the Great Recession, there were several months where about one million jobs were lost. Consistently above-average job additions started after 2011. 

The cumulative effect can be seen by looking at the deviations from trend.
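
To see why the two views are linked, here is a minimal sketch in plain Python on toy numbers (the helper names and data are hypothetical). Summing the monthly changes in excess of the baseline slope reproduces the deviation from a straight line that starts at the first observation. A fitted trend generally has a different intercept, so its deviations would differ from this running sum by a constant offset.

```python
def excess_changes(levels, slope):
    """Month-over-month change minus the baseline slope."""
    return [b - a - slope for a, b in zip(levels, levels[1:])]


def running_sum(values):
    """Cumulative sum of a list of numbers."""
    total, out = 0.0, []
    for v in values:
        total += v
        out.append(total)
    return out


# Toy levels (hypothetical) with a baseline slope of 10 per month.
levels = [100.0, 110.0, 115.0, 135.0]
slope = 10.0

cumulative_excess = running_sum(excess_changes(levels, slope))

# Deviation from the line through the first point with the same slope.
line_deviation = [levels[t] - (levels[0] + slope * t)
                  for t in range(1, len(levels))]
```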

In [None]:
nfp_dev = nfp - nfp_trend

In [None]:
plot( nfp_dev )

In [None]:
#  Zoom in on deviations post-2000:
plot( nfp_dev['2000':])

From 2008 through 2011, over 11 million workers were removed from NFP. 
As of November 2015 the recovery has been gradual, and NFP remains below trend. 

## Forecasting NFP

Examining the first difference series is insightful in observing 
how the economy is developing over the short-term. To forecast NFP, 
however, it is instructive to go back to the original series and use 
the Holt-Winters method.
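
For intuition only, the following is a minimal sketch of Holt's linear-trend smoothing, which is Holt-Winters without the seasonal component (a simplification chosen here because PAYEMS is already seasonally adjusted). It is not necessarily how `forecast` is implemented; the function name `holt_forecast` and the smoothing parameters `alpha` and `beta` are hypothetical.

```python
def holt_forecast(y, h, alpha=0.3, beta=0.1):
    """Forecast h steps ahead using Holt's linear-trend smoothing."""
    level = float(y[0])
    trend = float(y[1] - y[0])            # initial trend from first two points
    for obs in y[1:]:
        prev_level = level
        # Blend the new observation with the previous one-step projection.
        level = alpha * obs + (1 - alpha) * (prev_level + trend)
        # Blend the latest change in level with the previous trend.
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return [level + k * trend for k in range(1, h + 1)]


# Toy series (hypothetical): a perfectly linear series is simply extrapolated.
toy = [100 + 3 * t for t in range(10)]
toy_forecast = holt_forecast(toy, 2)
```

The point forecast for the next monthly change is then the first forecast value minus the last observed level.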

In [None]:
#  Forecast 12 months into the future:
forecast( nfp, 12 )

In [None]:
#  2015-11-12, forecast NFP change for Nov 2015:
142964 - 142654

In [None]:
#  Estimate dispersion:
stat( nfp_dif_base['2000':] )

The point forecast is clouded by a rather large standard error of 232 (in thousands), 
which makes this approach of little practical use.

An alternative is to notice how nfp_dif has been range bound since 2011. 
The mean can serve as a point forecast for the change in NFP, 
and the std gives us some estimate of the standard error.
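
A minimal sketch of that idea using only the standard library: the sample mean is the point forecast and one sample standard deviation on either side gives the band. The helper name `range_bound_forecast` and the toy data are hypothetical, and the use of the sample (n-1) standard deviation is an assumption about what `stat` reports.

```python
import statistics


def range_bound_forecast(changes):
    """Return (point, lower, upper) using the mean and one sample std."""
    point = statistics.mean(changes)
    spread = statistics.stdev(changes)    # sample standard deviation (n - 1)
    return point, point - spread, point + spread


# Toy monthly changes in thousands (hypothetical values).
toy_changes = [100.0, 200.0, 300.0]
point, lower, upper = range_bound_forecast(toy_changes)
```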

In [None]:
stat( nfp_dif['2011':] )

2015-11-12:  Our forecasted NFP change for Nov 2015 = **+205,000** +/- 77,000 at 1 std.
(So the Fed would be discouraged by a change < +128,000, which happens to be 
our computed baseline trend.) 

The street mean prediction for Oct 2015 
was +185,000, but the actual turned out to be +271,000 -- whereupon 
Bill Gross declared 100% chance of December rate hike.

## Relationship to equities

Here we explore the linear relationship between NFP and SPX (S&P 500 index).

In [None]:
#  Resampled to monthly, the NFP frequency:
spx = get( m4spx )

In [None]:
stat2( nfp[Y], spx[Y] )

The correlation between NFP and SPX is about 88%, and R-squared for this linear regression is 77%.

The regression can be written as: $ N = b S + c $, where N is NFP, S is SPX, and c is the intercept.

Most interesting as a consequence is: $ \Delta N = b \Delta S $ 
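
The differenced form follows because the intercept cancels when two consecutive months are subtracted. The time subscript $t$ is introduced here for illustration, and the regression residual is ignored:

$$ N_t - N_{t-1} = (b S_t + c) - (b S_{t-1} + c) = b \, (S_t - S_{t-1}) $$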

The parameter b is estimated to be 45.31 (as of 2015-11-12).

In [None]:
#  The baseline change in NFP, translated as change in SPX:
b = 45.31
spx_slope = nfp_slope / b
spx_slope

In [None]:
#  Assume SPX level of 2000, and calculate annualized rate:
spx_rate = (spx_slope / 2000.0) * 1200
spx_rate

**Annual rate of +1.71% in SPX roughly corresponds *currently* 
to +129,000 additional workers on NFP.** 

In [None]:
#  SPX geometric mean return
georet( spx, 12 )

In [None]:
#  NFP geometric growth rate:
georet( nfp, 12 )

In [None]:
#  Another standard error estimate for NFP:
tailvalue( nfp ) * 0.0139 / ( 12 ** 0.5 )
#                  ^annualized volatility scaled back monthly

Relying on the regression to forecast NFP can become too dangerous when the SPX is volatile. 
For example, assuming +7.0% SPX annualized return, the forecasted NFP change 
would be +528,000 which is realistically out-of-bounds.
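
The arithmetic behind that figure can be sketched as follows, using the notebook's estimate b = 45.31 and its assumed SPX level of 2000. The function name `nfp_change_from_spx_rate` is hypothetical.

```python
def nfp_change_from_spx_rate(annual_pct, spx_level=2000.0, b=45.31):
    """Implied monthly NFP change (thousands) for an annualized SPX return."""
    # Convert the annualized percentage return into monthly index points.
    monthly_spx_points = spx_level * (annual_pct / 100.0) / 12.0
    # Apply the regression relation: delta N = b * delta S.
    return b * monthly_spx_points


implied_at_7pct = nfp_change_from_spx_rate(7.0)      # roughly 528.6 thousand
implied_at_trend = nfp_change_from_spx_rate(1.71)    # roughly 129 thousand
```

A swing of a few percentage points in the assumed SPX return moves the implied NFP change by hundreds of thousands, which is why the regression is fragile as a forecasting tool.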

The natural growth rate for NFP appears to be around 2.0%. 
NFP volatility is about 11% of SPX volatility; however, 
the implied standard error for forecasting monthly 
changes is much larger than that of our localized models. 
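
As a rough check on that comparison, the implied monthly standard error can be recomputed by hand. The annualized volatility of 0.0139 is taken from the cell above; the NFP level of 142,654 (thousands) is taken from the subtraction in the forecast cell and is assumed here to be the latest observation. The function name `monthly_std_error` is hypothetical.

```python
import math


def monthly_std_error(level, annual_vol):
    """Scale annualized volatility to one month and express it in level units."""
    return level * annual_vol / math.sqrt(12)


# Inputs from the notebook (level assumed to be the latest NFP, in thousands).
implied_se = monthly_std_error(142654, 0.0139)       # roughly 572 thousand
```

That is several times the 77 (thousand) obtained from the post-2011 first differences.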

### Closing remarks

When the stock market is booming, it is reasonable to expect more corporate workers 
on the payroll. NFP and SPX are indeed reasonably correlated, but the data shows that 
a linear regression model is too simplistic to capture the volatile dynamics.