### Machine Learning for Econonomics Journal Abstracts

In [2]:

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

In [113]:
topJourns_df = pd.read_csv("raw_data_econ/topRanks_cleaned.csv", encoding = "'iso-8859-1'")
hiJourns_df = pd.read_csv("raw_data_econ/hiRanks_cleaned.csv", encoding = "'iso-8859-1'")
notHiJourns_df = pd.read_csv("raw_data_econ/notHiRanks_cleaned.csv", encoding = "'iso-8859-1'")
notHiJourns2_df = pd.read_csv("raw_data_econ/notHiRanks2_cleaned.csv", encoding = "'iso-8859-1'")

frames = [topJourns_df, hiJourns_df, notHiJourns_df, notHiJourns2_df]
combined_df = pd.concat(frames)
combined_df["abstract"].shape

(8229,)

In [142]:
# split data into test & train
X = combined_df["abstract"]
y = combined_df["top_journal"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1, stratify=y)

# transform X and y to lists for processing
X_train = X_train.tolist()
X_test = X_test.tolist()
y_train = y_train.tolist()
y_test = y_test.tolist()

In [143]:
# Fit to logistic regression function
classifier = LogisticRegression()

#word to vector
tfidf_vectorizer = TfidfVectorizer(analyzer='word', ngram_range=(1, 3))
hash_vectorizer = HashingVectorizer(analyzer='word', ngram_range=(1, 3),n_features=50000)
X_train=hash_vectorizer.fit_transform(X_train)
X_test=hash_vectorizer.fit_transform(X_test)

classifier.fit(X_train, y_train)

LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
          intercept_scaling=1, max_iter=100, multi_class='ovr', n_jobs=1,
          penalty='l2', random_state=None, solver='liblinear', tol=0.0001,
          verbose=0, warm_start=False)

In [86]:
combined_df.head()

Unnamed: 0.1,Unnamed: 0,abstract,top_journal
0,0,We propose local measure relationship paramete...,1
1,1,The labor market increasingly rewards social s...,1
2,2,We develop theory endogenous uncertainty busin...,1
3,3,What shapes optimal degree progressivity tax t...,1
4,4,An increase household debt GDP ratio predicts ...,1


In [144]:
# training and testing data score
print(f"Training Data Score: {classifier.score(X_train, y_train)}")
print(f"Testing Data Score: {classifier.score(X_test, y_test)}")

Training Data Score: 0.9009884945713823
Testing Data Score: 0.6982507288629738


In [88]:
# Making predictions
predictions = classifier.predict(X_test)
pd.DataFrame({"Prediction": predictions, "Actual": y_test}).head(20)

Unnamed: 0,Actual,Prediction
0,1,1
1,1,1
2,0,0
3,0,1
4,0,0
5,0,1
6,1,1
7,0,1
8,1,1
9,0,0


In [145]:
from sklearn.metrics import classification_report
target_names = ["Not top Journal", "Top Journal"]
report = classification_report(y_test, predictions, target_names=target_names)
print(report)

                 precision    recall  f1-score   support

Not top Journal       0.71      0.68      0.70      1034
    Top Journal       0.69      0.71      0.70      1024

    avg / total       0.70      0.70      0.70      2058



# Naive Bayes

In [146]:
combined_df = combined_df[['abstract','top_journal']]
combined_df.head()

Unnamed: 0,abstract,top_journal
0,We propose local measure relationship paramete...,1
1,The labor market increasingly rewards social s...,1
2,We develop theory endogenous uncertainty busin...,1
3,What shapes optimal degree progressivity tax t...,1
4,An increase household debt GDP ratio predicts ...,1


In [147]:
data = combined_df #text in column 1, classifier in column 2.
import numpy as np
numpy_array = data.as_matrix()
X = combined_df["abstract"]
#X=X.astype('float')
Y = combined_df["top_journal"]
#Y=Y.astype('float')
from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(
 X, Y, test_size=0.4, random_state=42)

In [148]:
from sklearn.feature_extraction.text import CountVectorizer

from sklearn.feature_extraction.text import TfidfTransformer

from sklearn.naive_bayes import MultinomialNB

In [149]:
from sklearn.pipeline import Pipeline
text_clf = Pipeline([('vect', CountVectorizer(stop_words='english')),
 ('tfidf', TfidfTransformer()),
 ('clf', MultinomialNB()),
])

In [150]:
text_clf = text_clf.fit(X_train,Y_train)

In [151]:
predicted = text_clf.predict(X_test)
np.mean(predicted == Y_test)


0.672539489671932

# SVM

In [152]:
# Training Support Vector Machines - SVM and calculating its performance

from sklearn.linear_model import SGDClassifier
text_clf_svm = Pipeline([('vect', CountVectorizer()), ('tfidf', TfidfTransformer()),
                         ('clf-svm', SGDClassifier(loss='hinge', penalty='l2',alpha=1e-4, n_iter=2, random_state=42))])

text_clf_svm = text_clf_svm.fit(X_train, Y_train)
predicted_svm = text_clf_svm.predict(X_test)
np.mean(predicted_svm == Y_test)



0.6713244228432563

# Grid Search

In [27]:
# Grid Search
# Here, we are creating a list of parameters for which we would like to do performance tuning. 
# E.g. vect__ngram_range; here we are telling to use unigram and bigrams and choose the one which is optimal.

from sklearn.model_selection import GridSearchCV
parameters = {'vect__ngram_range': [(1, 1), (1, 2)], 'tfidf__use_idf': (True, False), 'clf__alpha': (1e-2, 1e-3)}

In [28]:
# Next, we create an instance of the grid search by passing the classifier, parameters 
# and n_jobs=-1 which tells to use multiple cores from user machine.

gs_clf = GridSearchCV(text_clf, parameters, n_jobs=-1)
gs_clf = gs_clf.fit(X_train, Y_train)

In [29]:
# To see the best mean score and the params, run the following code

gs_clf.best_score_
gs_clf.best_params_

# Output for above should be: The accuracy has now increased to ~90.6% for the NB classifier (not so naive anymore! 😄)
# and the corresponding parameters are {‘clf__alpha’: 0.01, ‘tfidf__use_idf’: True, ‘vect__ngram_range’: (1, 2)}.

{'clf__alpha': 0.01, 'tfidf__use_idf': False, 'vect__ngram_range': (1, 2)}

In [30]:
# Similarly doing grid search for SVM
from sklearn.model_selection import GridSearchCV
parameters_svm = {'vect__ngram_range': [(1, 1), (1, 2)], 'tfidf__use_idf': (True, False),'clf-svm__alpha': (1e-2, 1e-3)}

gs_clf_svm = GridSearchCV(text_clf_svm, parameters_svm, n_jobs=-1)
gs_clf_svm = gs_clf_svm.fit(X_train, Y_train)


gs_clf_svm.best_score_
gs_clf_svm.best_params_



{'clf-svm__alpha': 0.001, 'tfidf__use_idf': True, 'vect__ngram_range': (1, 1)}

In [153]:
#Stemming Code

import nltk
nltk.download()

from nltk.stem.snowball import SnowballStemmer
stemmer = SnowballStemmer("english", ignore_stopwords=True)

class StemmedCountVectorizer(CountVectorizer):
    def build_analyzer(self):
        analyzer = super(StemmedCountVectorizer, self).build_analyzer()
        return lambda doc: ([stemmer.stem(w) for w in analyzer(doc)])
    
stemmed_count_vect = StemmedCountVectorizer(stop_words='english')

text_mnb_stemmed = Pipeline([('vect', stemmed_count_vect), ('tfidf', TfidfTransformer()), 
                             ('mnb', MultinomialNB(fit_prior=False))])

text_mnb_stemmed = text_mnb_stemmed.fit(X_train, Y_train)

predicted_mnb_stemmed = text_mnb_stemmed.predict(X_test)

np.mean(predicted_mnb_stemmed == Y_test)

showing info https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/index.xml


KeyboardInterrupt: 

# TREE

In [97]:
from sklearn import tree

In [124]:
X_scaler = StandardScaler().fit(X_train.values.reshape(-1, 1))

from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(
 encoded_X, Y, test_size=0.4, random_state=42)

clf = tree.DecisionTreeClassifier()
clf = clf.fit(X_train, Y_train)
clf.score(X_test, Y_test)

ValueError: Expected 2D array, got 1D array instead:
array=[1090. 5347. 4402. ... 4317. 7371. 2804.].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

# Random Forest 

In [122]:





from sklearn.ensemble import RandomForestClassifier
rf = RandomForestClassifier(n_estimators=200)
rf = rf.fit(X_train, y_train)
rf.score(X_test, Y_test)

ValueError: could not convert string to float: 'Conventional economic analyses successful explaining differences living arrangements particularly dramatic increase fraction young adults living parents Mediterranean Europe . This paper presents cultural interpretation . I argue sexual revolution 1970s-by liberalizing parental attitudes-had differential impact living arrangements Northern Southern Europe account closer parent-child ties Southern Europe . Such interpretation easily explain shift living arrangements time also observed North-South differentials . It receives support data living arrangements second-generation immigrants United States , 1970 2000 . This duplication European pattern neutral environment , unemployment benefits , welfare code , macroeconomic conditions suggests major role culture determining living arrangements . ( JEL : D1 , J1 , Z13 ) ( c ) 2007 European Economic Association .'

# KNN

In [100]:
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

In [121]:
from sklearn.preprocessing import StandardScaler

# Create a StandardScater model and fit it to the training data

X_scaler = StandardScaler().fit(X_train.values.reshape(-1, 1))

AttributeError: 'list' object has no attribute 'values'

In [116]:
from sklearn.preprocessing import LabelEncoder

# Step 1: Label-encode data set
label_encoder = LabelEncoder()
label_encoder.fit(X)
encoded_X = label_encoder.transform(X)

In [132]:
for label, original_class in zip(encoded_X, X):
    print('Original Class: ' + str(original_class))
    print('Encoded Label: ' + str(label))
    print('-' * 12)

Original Class: We propose local measure relationship parameter estimates moments data depend . Our measure computed negligible cost even complex structural models . We argue reporting measure increase transparency structural estimates , making easier readers predict way violations identifying assumptions would affect results . When key assumptions orthogonality error terms excluded instruments , show measure provides natural extension omitted variables bias formula nonlinear models . We illustrate applications published articles several fields economics .
Encoded Label: 6915
------------
Original Class: The labor market increasingly rewards social skills . Between 1980 2012 , jobs requiring high levels social interaction grew nearly 12 percentage points share U.S. labor force . Math-intensive less social jobs ? including many STEM occupations ? shrank 3.3 percentage points period . Employment wage growth particularly strong jobs requiring high levels math skill social skills . To unde

Original Class: This paper examines prices , markups , marginal costs respond trade liberalization . We develop framework estimate markups production data multi ? product firms . This approach require assumptions market structure demand curves faced firms , assumptions firms allocate inputs across products . We exploit quantity price information disentangle markups quantity ? based productivity , compute marginal costs dividing observed prices estimated markups . We use India 's trade liberalization episode examine firms adjust performance measures . Not surprisingly , find trade liberalization lowers factory ? gate prices output tariff declines expected pro ? competitive effects . However , price declines small relative declines marginal costs , fall predominantly input tariff liberalization . The reason incomplete cost pass ? prices firms offset reductions marginal costs raising markups . Our results demonstrate substantial heterogeneity variability markups across firms time suggest 

Encoded Label: 1495
------------
Original Class: Exploiting within-firm , over-time variation plan parameters nearly 10,000 Long Term Disability ( LTD ) policies held US employers , present first empirical analysis determinants private LTD spells . We find shorter waiting period higher replacement rate increase incidence LTD spells . Sixty percent latter effect due mechanical censoring shorter spells , remainder due deterrence spells would continued beyond waiting period . Deterrence driven primarily reduction incidence shorter duration spells less severe disabilities .
Encoded Label: 1131
------------
Original Class: This paper reports results large-scaled randomized controlled experiment comparing public private provision counseling job seekers . The intention-to-treat estimates programs statistically different , workers enrolled private program , implying effect per beneficiary twice large public private program . We find suggestive evidence private firms may insufficiently mastered

Original Class: This paper examines investment choices nonprofit hospitals . It tests shocks cash flows caused performance hospitals ? financial assets affect hospital expenditures . Capital expenditures increase , average , 10 28 cents every dollar received financial assets . The sensitivity similar found earlier shareholder-owned corporations . Executive compensation , salaries , perks respond significantly cash flow shocks . Hospitals apparent tendency overspend medical procedures exhibit higher investment-cash flow sensitivities . The sensitivities higher hospitals appear financially constrained .
Encoded Label: 4580
------------
Original Class: We study impact directors foreign experience firm performance emerging markets . Using unique data set China , exploit introduction policies attract talented emigrants increase supply individuals foreign experience different provinces different times . We document performance increases firms hire directors foreign experience identify channe

Original Class: The question guards guards intimately connected broader questions state capacity establishment monopoly violence society , something often viewed defining feature modern state . But establish monopoly , civilian rulers need build effective military , also control . In paper study governments may solve problem recognize decisions build strong army may ramifications subsequent coups . ( JEL : H11 , H56 ) ( c ) 2010 European Economic Association .
Encoded Label: 3648
------------
Original Class: A notable feature post-World War II civil wars long average duration . We provide theory persistence civil wars . The civilian government successfully defeat rebellious factions creating relatively strong army . In weakly institutionalized polities opens way excessive influence coups military . Civilian governments whose rents largely unaffected civil wars choose small weak armies incapable ending insurrections . Our framework also shows civilian governments need take decisive acti

Original Class: This research paper builds previous literature documents general changes labor market Native American women occurred Great Recession using extracts data Current Population Survey Annual Earnings file , known Merged Outgoing Rotation Groups ( MORG ) . Wages , unemployment , labor market variables Native American women contrasted Native American men white women determine relative change labor market inequality occurred Great Recession .
Encoded Label: 5272
------------
Original Class: We analyze two noncontributory Mexican pension programs elderly . Both paid similar amounts , one paid monthly paid every two months . The Life Cycle Hypothesis suggests frequency benefits payments affect consumption smoothing , find monthly program effective smoothing food expenditure . It also increased doctor visits reduced incidence hunger spells . Under bimonthly program , expenditures food significantly decline paychecks ownership durable goods increased . This suggests importance paym

Original Class: The paper estimates model allows shifts aggressiveness monetary policy time variation distribution macroeconomic shocks . These model features induce variations cyclical properties inflation riskiness bonds . The estimation identifies inflation procyclical late 1990s , economy shifted toward aggressive monetary policy experienced procyclical macroeconomics shocks . Since bonds hedge stock market risks inflation procylical , stock-bond return correlation turned negative late 1990s . The risks encountering countercyclical inflation future could lead upward-sloping yield curve , like data.Received September 11 , 2016 ; editorial decision January 2 , 2017 Editor Stijn Van Nieuwerburgh .
Encoded Label: 3559
------------
Original Class: State-of-the-art term structure models commodity prices serious difficulties extrapolating prices long-maturity futures contracts short-dated contracts . This situation problematic valuing real commodity-linked assets . We estimate nonlinear f

------------
Original Class: A central bank may purchase assets financial crisis exit purchases . Agents rational expectations financial crises rare events , probability central bank purchases assets , exit strategy . Selling assets quickly produces double-dip recession slowly unwinding generates smooth recovery . Expectations exit strategy influence initial effectiveness purchases . Increasing probability purchases crises distorts pre-crisis economy depends upon exit strategy . The welfare benefits unconventional policy may differ ex-ante versus ex-post , preferred exit strategy .
Encoded Label: 8
------------
Original Class: In equilibrium model labor market moral hazard , jobs dynamic contracts , job separations terminations optimal dynamic contracts . Transitions unemployment new jobs modeled process random matching Nash bargaining . Non-employed workers make consumption saving decisions standard growth model , well whether participate labor market . The stationary equilibrium char

IOPub data rate exceeded.
The notebook server will temporarily stop sending output
to the client in order to avoid crashing it.
To change this limit, set the config variable
`--NotebookApp.iopub_data_rate_limit`.

Current values:
NotebookApp.iopub_data_rate_limit=1000000.0 (bytes/sec)
NotebookApp.rate_limit_window=3.0 (secs)



------------
Original Class: At close World War II , future economic development subject wide-ranging debates . Historical experience since shown forecasts uniformly pessimistic . Expectations American economy focused likelihood secular stagnation , continued debated throughout post-war period . Concerns raised late 1960s early 1970s rapid population growth smothering potential economic growth developing countries contradicted , mid- late-1970s , fertility rates began decline rapidly . Predictions food production would keep population growth also proven wrong : 1961 2000 , calories per capita worldwide increased 24 percent , despite doubling global population . The high rates economic growth East Southeast Asia also unforeseen economists . Copyright 2005 , International Monetary Fund
Encoded Label: 498
------------
Original Class: The topic paper inspired something particular interest Michael Mussa , something made major important contributions IMF . The biannual World Economic Outlook

Encoded Label: 1361
------------
Original Class: We consider identification estimation Roy model includes common nonpecuniary utility component associated choice alternative . This augmented Roy model broader applications many polychotomous choice problems addition occupational sorting . We develop pair nonparametric estimators model , derive asymptotics , illustrate small-sample properties series Monte Carlo experiments . We apply one models migration behavior analyze effect Roy sorting observed returns college education . Correcting Roy sorting bias , returns college degree cut half . This article supplementary material online . ( This abstract borrowed another version item . )
Encoded Label: 6018
------------
Original Class: To forecast aggregate , propose adding disaggregate variables , instead combining forecasts disaggregates forecasting univariate aggregate model . New analytical results show effects changing coefficients , mis-specification , estimation uncertainty mis-measurem

Original Class: We investigate separate joint influences social engagement measures stock market participation find socially engaged individuals likely participate . Consistent Granovetter?s theory social networks find weak tie ( measured social group involvement ) positive effect stock market participation whereas strong tie ( measured frequency talking neighbors ) effect . More trusting individuals likely participate stock market , identify political party . In contrast , degree religion important appears little impact .
Encoded Label: 6731
------------
Original Class: I derive test multi-horizon implications consumption-based equilibrium model featuring fluctuating expected growth volatility . My setup allows consumption dynamics estimated jointly covariance risk prices single-stage generalized method moment , inferences asset pricing tests reflect uncertainty coming factor estimation . I show changes consumption volatility key driver explaining major asset pricing anomalies across 

In [126]:
from keras.utils import to_categorical

# Step 2: One-hot encoding
one_hot_X = to_categorical(encoded_X)
one_hot_X

Using TensorFlow backend.


ModuleNotFoundError: No module named 'tensorflow'

In [127]:
print(X.shape)
print(y.shape)

(8229,)
(8229,)


In [128]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

In [140]:
from sklearn.preprocessing import LabelEncoder

# Step 1: Label-encode data set
label_encoder = LabelEncoder()
label_encoder.fit(X_train)
encoded_X = label_encoder.transform(X_train)

for label, original_class in zip(encoded_X, X_train):
    print('Original Class: ' + str(original_class))
    print('Encoded Label: ' + str(label))
    print('-' * 12)


Original Class: The intention paper ( ) introduce multi-regional dynamic emissions trading model ( ii ) examine potential impact emissions trading scheme ( ETS ) long-term evolution energy technologies national regional perspectives China . The establishment model salutary attempt Sinicize global integrated assessment model combines economy , energy , environment systems . The simulation results indicate : ( 1 ) majority regions , ETS effective cutting CO2 emissions harmonized carbon tax ( HCT ) , might true entire country , means two options little difference overall carbon reduction ; ( 2 ) carbon tax policy cost-effective option curbing CO2 respect ETS long run ; ( 3 ) neither ETS pure carbon tax provide enough incentives breakthrough carbon-free energy technologies , illustrates matching support policies , subsidies R & D ; investment , essential extend niche market ; ( 4 ) In context ETS , diffusion non-fossil technologies regions act sellers performs much better diffusion buyer r

------------
Original Class: This paper looks empirical theoretical background high shares renewables electricity system . First examine meant `` high shares '' renewables ; next consider mean electricity `` markets '' ; discuss term `` cope '' implies ; returning suitability `` current '' electricity markets . Second , turn three examples jurisdictions - Germany , UK State New York US - specific aspirations decarbonisation role renewables . Each exhibits different approaches way adjusting electricity market design cope high shares renewables . We suggest new wave electricity experiments beginning around theme incorporate large shares intermittent renewable generation electricity systems .
Encoded Label: 3712
------------
Original Class: This paper develops theory endogenous formation common market three-country , two-factor political economy model . In status quo , Home Foreign implement nondiscriminatory policies toward international factor flows order maximize domestic median voter 

------------
Original Class: We assess properties currency value strategies based real exchange rates . We find real exchange rates predictive power cross-section currency excess returns . However , adjusting real exchange rates key country-specific fundamentals ( productivity , quality export goods , net foreign assets , output gaps ) better isolates information related currency risk premium . In turn , resultant measure currency value displays considerably stronger predictive power currency excess returns . Finally , predictive information content currency value measure distinct embedded popular currency strategies , carry momentum.Received June 26 , 2015 ; accepted June 8 , 2016 Editor Stefan Nagel .
Encoded Label: 4519
------------
Original Class: Using unique data set students first regional schools colonial Benin , investigate effect education living standards , occupation , political participation . Since school locations student cohorts selected little information , treatment c

Encoded Label: 1933
------------
Original Class: A new information aggregation mechanism ( IAM ) , developed via laboratory experimental methods , implemented inside Intel Corporation long-running field test . The IAM , incorporating features pari-mutuel betting , uniquely designed collect quantize probability distributions dispersed , subjectively held information . IAM participants ? incentives support timely information revelation emergence consensus beliefs future outcomes . Empirical tests demonstrate robustness experimental results IAM ? practical usefulness addressing real-world problems . The IAM ? predictive distributions forecasting sales accurate , especially short horizons direct sales channels , often proving accurate Intel ? internal forecast .
Encoded Label: 83
------------
Original Class: Investment necessary growth risky often requires external financing . For emerging market , access international credit markets volatile interest rates reflect risk default . We presen

Encoded Label: 2507
------------
Original Class: Using original micro-dataset France , investigate nominal wage stickiness . Nominal wage changes found occur quarterly frequency around 38 percent sample period , large extent staggered across establishments , synchronized within establishments . We carry econometric analysis wage changes based two-threshold sample selection model . Our results timing wage adjustments time-dependent opposed state-dependent , evidence predetermination wage changes , backward forward-looking behavior relevant wage setting . ( JEL E24 , E52 , J31 )
Encoded Label: 4360
------------
Original Class: We integrate housing market labor market dynamic general equilibrium model credit search frictions . We argue labor channel , combined standard credit channel , provides strong transmission mechanism deliver potential solution Shimer ( 2005 ) puzzle . The model confronted U.S. macroeconomic time series . The estimation results account two prominent facts observed d

Original Class: This paper considers panel growth regressions presence model uncertainty reverse causality concerns . For purpose , econometric framework combines Bayesian Model Averaging suitable likelihood function dynamic panel models weakly exogenous regressors fixed effects . An application econometric methodology panel countries 1960-2000 period indicates robust determinant economic growth rate conditional convergence indistinguishable zero . ( This abstract borrowed another version item . )
Encoded Label: 3327
------------
Encoded Label: 2241
------------
Original Class: I provide critical overview literature political decentralization . After reviewing first- second-generation theories federalism , I describe recent empirical studies focusing mainly determinants capture local government accountability emphasized second-generation theories . The article concludes describing emerging new issues deserve attention future research : wider range political distortions beyond capture c

Original Class: Research oil markets conducted last decade challenged long-held beliefs causes consequences oil price shocks . As empirical theoretical models used economists evolved , understanding determinants oil price shocks interaction oil markets global economy . Some key insights real price oil endogenous respect economic fundamentals oil price shocks occur ceteris paribus . As result , one must explicitly account demand supply shocks underlying oil price shocks studying transmission domestic economy . Disentangling cause effect relationship oil prices economy requires structural models global economy including oil market .
Encoded Label: 2171
------------
Original Class: We assess rate replication empirical papers 2010 American Economic Review . Across 70 empirical papers , find 29 percent 1 citation partially replicates original result . While minority papers published replication , majority ( 60 percent ) either replication , robustness test , extension . Surveying authors wi

IOPub data rate exceeded.
The notebook server will temporarily stop sending output
to the client in order to avoid crashing it.
To change this limit, set the config variable
`--NotebookApp.iopub_data_rate_limit`.

Current values:
NotebookApp.iopub_data_rate_limit=1000000.0 (bytes/sec)
NotebookApp.rate_limit_window=3.0 (secs)



In [141]:
from sklearn.preprocessing import StandardScaler

X_scaler = StandardScaler().fit(X_train)

X_train_scaled = X_scaler.transform(X_train)
X_test_scaled = X_scaler.transform(X_test)

ValueError: could not convert string to float: 'The potential Internet-enabled distance learning transform higher education focuses attention exactly residential higher-education institutions students . Two recent books marshal detailed quantitative subjective data individual student outcomes document effects two institutions outcomes might improved . Paying Party concludes Midwestern state university reinforces existing economic inequalities rather fostering upward mobility . How College Works finds northeastern liberal-arts college generally serves students well suggests low-cost improvements . These claims evaluated .'