# Projecting Heightened Maternal Mortality
- _by: Isaac D. Tucker-Rasbury_
- _Start: 6/27/2022_

##### Summary
This notebook ingests data from reputable sources and runs a regression analysis to forecast maternal mortality by state.

##### Objective
1. Subject Matter - Map maternal morbidity factors across the US, then layer in abortion care restrictions and bans.
2. Skillset Used - Data wrangling through visualization.
3. Goal - Leverage Python to aggregate data from public APIs.

## Environment Setup and Dataset Imports 

### Initial Imports for creating Data Analytics Environment

In [11]:
# Easiest Import Option

## Libraries for Data Science - Import All Statement
import pyforest

lazy_imports()


['from fbprophet import Prophet',
 'from sklearn.preprocessing import RobustScaler',
 'import sys',
 'import tqdm',
 'from sklearn.model_selection import KFold',
 'from openpyxl import load_workbook',
 'import plotly as py',
 'from sklearn.linear_model import Ridge',
 'import awswrangler as wr',
 'import re',
 'import keras',
 'from sklearn.model_selection import RandomizedSearchCV',
 'import imutils',
 'import statistics',
 'import glob',
 'from sklearn.model_selection import cross_val_score',
 'import altair as alt',
 'from sklearn.preprocessing import StandardScaler',
 'from sklearn.feature_extraction.text import CountVectorizer',
 'from sklearn.linear_model import LogisticRegression',
 'from sklearn.linear_model import LinearRegression',
 'from sklearn.preprocessing import MinMaxScaler',
 'from sklearn.ensemble import GradientBoostingClassifier',
 'import torch',
 'from sklearn.linear_model import ElasticNet',
 'import os',
 'from sklearn import metrics',
 'import spacy',
 'from st
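The list printed above is long, so it can help to check programmatically that the libraries a project depends on are actually covered. The following is a minimal sketch, not part of pyforest: `missing_imports` and the `required` names are hypothetical, and the sample statements are copied from the output above. In the notebook, the list returned by `lazy_imports()` could be passed in place of `sample`.

```python
import re


def missing_imports(statements, required):
    """Return the names in `required` that no import statement mentions.

    `statements` is a list of strings such as 'import plotly as py'.
    (Hypothetical helper for illustration; not a pyforest function.)
    """
    missing = []
    for name in required:
        # Split each statement on whitespace, dots, and commas so that
        # 'sklearn' matches 'from sklearn.linear_model import Ridge'.
        found = any(name in re.split(r"[\s.,]+", stmt) for stmt in statements)
        if not found:
            missing.append(name)
    return missing


# A few entries copied from the lazy_imports() output above.
sample = [
    "from sklearn.linear_model import LinearRegression",
    "import plotly as py",
    "import altair as alt",
    "import re",
]

# 'folium' is not in the sample, so it is reported as missing.
print(missing_imports(sample, ["LinearRegression", "plotly", "folium"]))
```

Any name reported as missing would need an explicit `import` in its own cell.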

In [12]:
#Pip Update
%pip install --upgrade pip
%pip --version


Note: you may need to restart the kernel to use updated packages.
pip 22.1.2 from /Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pip (python 3.9)
Note: you may need to restart the kernel to use updated packages.


In [13]:
# Installations and Imports for creating Data Analytics Environment

%pip install --upgrade pyforest
 #Comment | %python -m pyforest install_extensions
%pip install --upgrade folium
%pip install pandas
%pip install matplotlib
%pip install bonobo
%pip install scrapy # Project Description - https://pypi.org/project/Scrapy/
%pip install beautifulsoup4 # Project Description - https://pypi.org/project/beautifulsoup4/
%pip install seaborn
%pip install geoplotlib
%pip install Pyglet
%pip install plotly


Note: you may need to restart the kernel to use updated packages.
Collecting packaging~=19.0
  Using cached packaging-19.2-py2.py3-none-any.whl (30 kB)
Installing collected packages: packaging
  Attempting uninstall: packaging
    Found existing installation: packaging 21.3
    Uninstalling packaging-21.3:
      Successfully uninstalled packaging-21.3
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
matplotlib 3.5.2 requires packaging>=20.0, but you have packaging 19.2 which is incompatible.
Successfully installed packaging-19.2
Note: you may need to restart the kernel to use updated packages.
Note: you may need to restart the kernel to use updated p
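The pip output above reports that one of the installs downgraded `packaging` to 19.2, while matplotlib 3.5.2 requires `packaging>=20.0`. A quick way to spot this kind of conflict after installing is to compare the installed version against the stated minimum. The sketch below uses only the standard library; `version_tuple`, `meets_minimum`, and `installed_version` are hypothetical helper names, and the comparison is deliberately simplified (numeric release segments only), so it is not a full PEP 440 implementation. The `packaging.version.Version` class handles the general case.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def version_tuple(text):
    """Turn a simple dotted release string such as '19.2' into (19, 2)."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric segment
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if `installed` is at least `minimum` (numeric parts only)."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '20' and '20.0' compare as equal.
    a = a + (0,) * (width - len(a))
    b = b + (0,) * (width - len(b))
    return a >= b


def installed_version(package):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# The two versions named in the pip output above.
print(meets_minimum("19.2", "20.0"))  # False: the downgraded version is too old
print(meets_minimum("21.3", "20.0"))  # True: the version pip uninstalled was fine
```

In the notebook, `installed_version("packaging")` would supply the first argument. If the check fails, `%pip install --upgrade packaging` followed by a kernel restart is one way to restore a compatible version.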

### Libraries Imported Explicitly

In [14]:
## Data Mining - explicit mention
import scrapy
from bs4 import BeautifulSoup

In [15]:
## Python to Web Queries Made easier
import requests
import json
import urllib3
from urllib3 import request

In [16]:
## to handle certificate verification
import certifi

In [17]:
## Data Processing and Modeling - explicit mention
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [18]:
## Data Viz and Plotting
import seaborn as sns

import geoplotlib
from geoplotlib.utils import read_csv, BoundingBox

import plotly.express as px

## Dataviz Libraries Appendix
# Link: https://towardsdatascience.com/best-libraries-for-geospatial-data-visualisation-in-python-d23834173b35

In [19]:
## Pipeline - ETL Library
import bonobo #Footnote: https://www.bonobo-project.org/


In [21]:
# Handling Certificate Validation (see pip install certifi)
# Pool manager that verifies server certificates against certifi's CA bundle
http = urllib3.PoolManager(
    cert_reqs='CERT_REQUIRED',
    ca_certs=certifi.where())

In [22]:
# to handle certificate verification
import certifi

### Import Datasets and Accompanying Details



In [23]:
# Import Data

## Planning
## Step 1. Pull data from the CDC PDF:
## [Maternal deaths and mortality rates: Each state, the District of Columbia, United States, 2018‐2020](https://www.cdc.gov/nchs/maternal-mortality/MMR-2018-2020-State-Data.pdf)

## Step 2. Web-scrape data from the Guardian website (https://www.theguardian.com/us-news/ng-interactive/2022/jun/28/tracking-where-abortion-laws-stand-in-every-state)
url = 'https://www.theguardian.com/us-news/ng-interactive/2022/jun/28/tracking-where-abortion-laws-stand-in-every-state'
dfs = pd.read_html(url)

print(len(dfs))

URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1122)>
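The `CERTIFICATE_VERIFY_FAILED` error above typically means Python could not find a local CA bundle when `pd.read_html` opened the URL itself. One possible workaround, sketched below, is to download the page separately over a connection that verifies against certifi's CA bundle and then hand the HTML text to `pd.read_html`. The helper names (`make_verified_context`, `fetch_html`, `read_tables`) and the `User-Agent` header are assumptions for illustration, and the sketch uses the standard library's `urllib` rather than `requests` to keep dependencies small.

```python
import ssl
import urllib.request
from io import StringIO


def make_verified_context(cafile=None):
    """Build an SSL context that verifies servers against `cafile`.

    With cafile=None, Python falls back to the system's default CA store.
    """
    return ssl.create_default_context(cafile=cafile)


def fetch_html(url, cafile=None, timeout=30):
    """Download a page as text over a certificate-verified connection."""
    # Some sites reject requests without a browser-like User-Agent (assumption).
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    context = make_verified_context(cafile)
    with urllib.request.urlopen(request, context=context, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def read_tables(url):
    """Fetch `url` using certifi's CA bundle and parse its HTML tables."""
    # Imported here so the helpers above stay standard-library only.
    import certifi
    import pandas as pd

    html = fetch_html(url, cafile=certifi.where())
    # pd.read_html raises ValueError if the page contains no <table> elements.
    return pd.read_html(StringIO(html))
```

In the notebook this would be called as `dfs = read_tables(url)`. Note that `pd.read_html` needs an HTML parser such as `lxml` installed, and that an interactive page like this one may build its tables with JavaScript, in which case no `<table>` elements would be found in the downloaded HTML and a different extraction approach would be needed.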

## Analysis Steps


Planning for the Problem

* Codify the problem
* Brainstorm game plan
* Resources
* Deadlines

# Appendix

## Resources Referenced Throughout Research

***Institutions***
* [Centers for Disease Control and Prevention's National Center for Health Statistics (NCHS)](https://www.cdc.gov/nchs/maternal-mortality/data.htm)

***Data Sources***
* [National Center for Health Statistics. Compressed Mortality File, 1999-2016 (machine readable data file and documentation, CD‑ROM Series 20, No. 2V) as compiled from data provided by the 57 vital statistics jurisdictions through the Vital Statistics Cooperative Program.  Hyattsville, Maryland. 2017.](https://www.cdc.gov/nchs/data_access/cmf.htm)

* [Maternal deaths and mortality rates: Each state, the District of Columbia, United States, 2018‐2020](https://www.cdc.gov/nchs/maternal-mortality/MMR-2018-2020-State-Data.pdf)

* [Centers for Disease Control and Prevention's Reproductive Health Data and Statistics](https://www.cdc.gov/reproductivehealth/data_stats/index.htm)


***Articles/Academic Papers***
* [University of Colorado Boulder - Study: Banning abortion would boost maternal mortality by double digits](https://www.colorado.edu/today/2021/09/08/study-banning-abortion-would-boost-maternal-mortality-double-digits)

* [National Institutes of Health's Office of Research on Women's Health - What Are Maternal Morbidity and Mortality?](https://orwh.od.nih.gov/mmm-portal/what-mmm)

* [Nelson, D.B., Moniz, M.H. & Davis, M.M. Population-level factors associated with maternal mortality in the United States, 1997–2012. BMC Public Health 18, 1007 (2018)](https://bmcpublichealth.biomedcentral.com/articles/10.1186/s12889-018-5935-2)

* [The Pew Charitable Trusts | Critics Fear Abortion Bans Could Jeopardize Health of Pregnant Women by Michael Ollove](https://www.pewtrusts.org/en/research-and-analysis/blogs/stateline/2022/06/22/critics-fear-abortion-bans-could-jeopardize-health-of-pregnant-women)

* [National Institutes of Health | Eunice Kennedy Shriver National Institute of Child Health and Human Development | What factors increase the risk of maternal morbidity and mortality?](https://www.nichd.nih.gov/health/topics/maternal-morbidity-mortality/conditioninfo/factors#)

* [The Guardian | Tracking Where Abortion Laws Stand in Every US State](https://www.theguardian.com/us-news/ng-interactive/2022/jun/28/tracking-where-abortion-laws-stand-in-every-state)

## Explaining Libraries Employed

| Library | Summary | Elaboration |
| --- | --- | --- |
| Pyforest | Quick Import All | Makes the industry-standard data science libraries available in the file at once. Confirm that the libraries you need are on its list. |
| Pandas | | |
| `from sklearn.linear_model import RidgeCV` | | |
| `from sklearn.feature_extraction.text import CountVectorizer` | | |
| `import dash` | | |
| `from sklearn.model_selection import StratifiedKFold` | | |
| `import seaborn as sns` | | |
| `from sklearn import metrics` | | |
| `from pathlib import Path` | | |
| `from sklearn.linear_model import LogisticRegression` | | |
| `import os` | | |
| `import tensorflow as tf` | | |
| `import sklearn` | | |
| `import plotly as py` | | |
| `from pyspark import SparkContext` | | |
| `from sklearn.linear_model import LinearRegression` | | |
| `from sklearn.preprocessing import MinMaxScaler` | | |
| `from sklearn.preprocessing import RobustScaler` | | |
| `from sklearn.decomposition import PCA` | | |
| `from PIL import Image` | | |
| `import tqdm` | | |
| `import spacy` | | |
| `import lightgbm as lgb` | | |
| `from sklearn.preprocessing import LabelEncoder` | | |
| `from sklearn.linear_model import LassoCV` | | |
| `from sklearn.manifold import TSNE` | | |
| `import pydot` | | |
| `import keras` | | |
| `from sklearn.ensemble import RandomForestRegressor` | | |
| `import fbprophet` | | |
| `import xgboost as xgb` | | |
| `from sklearn import svm` | | |
| `from sklearn.feature_extraction.text import TfidfVectorizer` | | |
| `from statsmodels.tsa.arima_model import ARIMA` | | |
| `from sklearn.linear_model import ElasticNet` | | |
| `from scipy import stats` | | |
| `from sklearn.preprocessing import OneHotEncoder` | | |
| `import pandas as pd` | | |
| `from sklearn.model_selection import train_test_split` | | |
| `from sklearn.ensemble import GradientBoostingRegressor` | | |
| `import glob` | | |
| `import plotly.graph_objs as go` | | |
| `import skimage` | | |
| `import torch` | | |
| `from sklearn.model_selection import GridSearchCV` | | |
| `import datetime as dt` | | |
| `from xlrd import open_workbook` | | |
| `from sklearn.linear_model import ElasticNetCV` | | |
| `from sklearn.preprocessing import StandardScaler` | | |
| `from sklearn.model_selection import cross_val_score` | | |
| `import statsmodels.api as sm` | | |
| `from scipy import signal as sg` | | |
| `import fastai` | | |
| `from dask import dataframe as dd` | | |
| `from sklearn.ensemble import GradientBoostingClassifier` | | |
| `from sklearn.preprocessing import PolynomialFeatures` | | |
| `import awswrangler as wr` | | |
| `import altair as alt` | | |
| `from openpyxl import load_workbook` | | |
| `import sys` | | |
| `import pickle` | | |
| `from sklearn.model_selection import RandomizedSearchCV` | | |
| `from sklearn.ensemble import RandomForestClassifier` | | |
| `import bokeh` | | |
| `import imutils` | | |
| `import matplotlib as mpl` | | |
| `import cv2` | | |
| `from sklearn.linear_model import Ridge` | | |
| `import re` | | |
| `import statistics` | | |
| `import plotly.express as px` | | |
| `from sklearn.cluster import KMeans` | | |
| `import numpy as np` | | |
| `from sklearn.linear_model import Lasso` | | |
| `from fbprophet import Prophet` | | |
| `import gensim` | | |
| `import nltk` | | |
| `import textblob` | | |
| `from sklearn.model_selection import KFold` | | |
| `import matplotlib.pyplot as plt` | | |
| `from sklearn.impute import SimpleImputer` | | |
