# An analysis of the influence of Black Southern Churches on the Southern Black Community

## Background:

Documenting the American South is one of the longest-running digital publishing initiatives at the University of North Carolina. It was designed to give researchers digital access to some of the library’s unique collections in the form of high-quality page scans as well as structured, corrected, and machine-readable text. (https://docsouth.unc.edu/docsouthdata/)

## Goal: 

Analyze the rhetoric of the Black Southern Church in the American South and its effects on documents written by self-emancipated and previously enslaved Black people.

## Research question:

Are there any measurable similarities between the themes of documents from Southern Black churches and the documents from self-emancipated and freed Black people?

## Approach:

[Documenting the American South](https://docsouth.unc.edu/) is one of the longest-running efforts by the University of North Carolina to collect, digitize, and publish documents from self-emancipated and freed Black people. Using [DocSouth Data](https://docsouth.unc.edu/docsouthdata/) and data from the [Religious Text Content Guide](https://docsouth.unc.edu/neh/religiouscontent.html), analyze the correlation, with regard to religion, between published materials from Southern Black churches (autobiographies, biographies, church documents, sermons, histories, encyclopedias, and others) and slave narratives.

Proposed analysis:

- Evaluate language of both data sets
- Evaluate themes of both data sets
- Using data from the [Religious Text Content Guide](https://docsouth.unc.edu/neh/religiouscontent.html), develop a theme predictor for texts
- Evaluate how much (if any) thematic overlap there is between religious texts and slave narratives
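
One simple way to put a number on thematic overlap is to compare word-count vectors with cosine similarity. The sketch below is only an illustration of that idea, not the method used in the notebook: the helper names `term_counts` and `cosine_similarity` are hypothetical, and the two sample strings are invented stand-ins rather than DocSouth text.

```python
import math
import re
from collections import Counter


def term_counts(text: str) -> Counter:
    """Lowercase the text and count its alphabetic tokens."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    # Counter returns 0 for missing terms, so only shared terms contribute.
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Invented example sentences (assumptions, not taken from the corpus).
sermon = "the lord shall deliver his people from bondage"
narrative = "i prayed the lord would deliver me from bondage"

score = cosine_similarity(term_counts(sermon), term_counts(narrative))
print(round(score, 3))
```

A score near 1 means the two texts use nearly the same vocabulary in similar proportions, while 0 means they share no terms. Raw counts are dominated by common words, so a real analysis would likely remove stop words or weight terms first.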

## Repo contents:
- [SouthernBlackChurchRhetoric](https://github.com/jaded-gloryy/doc-south-analysis/blob/main/SouthernBlackChurchRhetoric.ipynb) contains all analysis.
- [Scrape_website](https://github.com/jaded-gloryy/doc-south-analysis/blob/main/scrape_website.py) contains the functions used to obtain [Religious Text Content Guide](https://docsouth.unc.edu/neh/religiouscontent.html) data.
- [Utils](https://github.com/jaded-gloryy/doc-south-analysis/blob/main/utils.py) contains generic functions for parsing an html document.


### Data preprocessing

##### Get content guide data

The data was scraped from the [Religious Text Content Guide](https://docsouth.unc.edu/neh/religiouscontent.html).

In [None]:
# import data scraped from the web and turn it into a df
import pandas as pd
from pandas import DataFrame as df
from scrape_website import scrape_data, custom_filter, build_dict
url = "https://docsouth.unc.edu/neh/religiouscontent.html"
tag_list = scrape_data(url=url, filter=custom_filter)
data_dict = build_dict(tag_list=tag_list)
content_guide_data = df.from_dict(data_dict)
# drop rows that have no page link (the list form of subset also works on older pandas versions)
content_guide_data = content_guide_data.dropna(subset=["page_link"])
content_guide_data

In [None]:
# some rows have non-numeric text in the year column; list them, then map each entry to the correct year
non_num_yrs = content_guide_data["year"].str.isnumeric() == False
content_guide_data[non_num_yrs]

change_dict = {
        "Offley, G. W. (Greensbury Washington":"1859",
        "Latta, M. L. (Morgan London":"1903",
        "Jamison, M. F. (Monroe Franklin":"1912",
        "Brinch, Boyrereau and Prentiss, Benjamin F. (Benjamin Franklin":"1817",
        "Foster, G. L. (Gustavus Lemuel":"1860",
        "Bradford, Sarah H. (Sarah Hopkins":"1869",
        "Green, J. D. (Jacob D.":"1864",
        "E. M. W. (Elizabeth Merwin Wickham":"1869"
    }

# replace years
content_guide_data["year"] = content_guide_data["year"].replace(change_dict)


In [None]:
# replace remaining non-numeric years
# find years "publ.?" where title says "Experience and Personal Narrative of Uncle Tom." and replace with "1854"
# regex=False matches the literal text "publ.?" instead of treating it as a regular expression
publ_string = content_guide_data["year"].str.contains("publ.?", regex=False)
string_1854 = "Experience and Personal Narrative of Uncle Tom"
find_1854 = content_guide_data["title"].str.contains(string_1854, regex=False)
publ_1854 = content_guide_data[publ_string & find_1854]

# update values that should say 1854 (a single .loc call avoids chained assignment)
content_guide_data.loc[publ_1854.index, "year"] = "1854"

# find years "publ.?" where title says "Sketch of the Life of Mr. Lewis Charlton" and replace with "1870"
string_1870 = "Sketch of the Life of Mr. Lewis Charlton"
find_1870 = content_guide_data["title"].str.contains(string_1870, regex=False)
publ_1870 = content_guide_data[publ_string & find_1870]

# content_guide_data_updated = content_guide_data[publ_string & find_1870].replace("publ.?","1870")
content_guide_data.loc[publ_1870.index, "year"] = "1870"
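
A note on matching the `publ.?` placeholder: pandas `str.contains` interprets its pattern as a regular expression by default, so `.` and `?` act as metacharacters unless `regex=False` is passed. The standard-library sketch below illustrates the difference with `re`; the sample list `years` is invented for the demonstration.

```python
import re

# Invented sample values (assumption): one clean year, the literal
# placeholder, and an unrelated string that merely contains "publ".
years = ["1854", "publ.?", "unpublished"]

# As a regex, "publ.?" means "publ" followed by one optional character of any kind.
as_regex = [y for y in years if re.search("publ.?", y)]

# Escaping the pattern restricts the match to the literal text "publ.?".
as_literal = [y for y in years if re.search(re.escape("publ.?"), y)]

print(as_regex)
print(as_literal)
```

The regex version also picks up `unpublished`, while the escaped version keeps only the literal placeholder. In this table the looser match is probably harmless, but a literal match states the intent more precisely.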


In [None]:
# save the cleaned content guide table to a CSV file
from config import CONFIG
content_guide_data.to_csv(CONFIG["CGD_FILEPATH"])

#### Get text data from the documents in the content guide

##### The next 3 cells contain code to grab text from specific pages of each document in the content_guide_data.

The output of the code below is saved in thematic_text.txt. Since this only needs to be performed once, the code is commented out.

In [None]:
# # get text files from links in page column
# from scrape_website import combine_url,get_pages_from_url, get_page_list
# base_url = "https://docsouth.unc.edu"
# #replace roman numeral ranges and typo from website
# content_guide_data = content_guide_data.replace({"page_numbers":{"iv-viii":"iv,v,vi,vii,viii", "vi-viii":"vi,vii,viii","28-19":"28-29"}})
# page_numbers = content_guide_data["page_numbers"]
# page_urls = content_guide_data["page_link"]

# full_links = combine_url(base_url=base_url, specific_url=list(page_urls))
# page_num_list = list(page_numbers)

# # get a list of pages
# page_ranges = []
# for page in page_num_list:
#     page_ranges.append(get_page_list(page))
# # manual page update for 
# page_ranges[374] = ["9","16"]

# thematic_texts = get_pages_from_url(urls=full_links, pages=page_ranges)

# with open('thematic_text.txt', 'w') as f:
#     for line in thematic_texts:
#         f.write(line)
#         f.write("\n")
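
The implementation of `get_page_list` is not shown here. As a rough illustration of what turning a `page_numbers` entry into a list of pages might involve, the sketch below expands comma-separated entries and numeric ranges. `expand_page_spec` is a hypothetical helper written for this example, and its behavior is an assumption rather than a description of the real function.

```python
def expand_page_spec(spec: str) -> list[str]:
    """Expand a spec such as "3,9-11" into ["3", "9", "10", "11"].

    Hypothetical helper: non-numeric entries (for example roman
    numerals) are passed through unchanged.
    """
    pages = []
    for part in spec.split(","):
        part = part.strip()
        start, sep, end = part.partition("-")
        if sep and start.isdigit() and end.isdigit():
            # Numeric range: include both endpoints.
            pages.extend(str(n) for n in range(int(start), int(end) + 1))
        elif part:
            # Single page or non-numeric label: keep as written.
            pages.append(part)
    return pages


print(expand_page_spec("28-29"))
print(expand_page_spec("iv,v,vi,vii,viii"))
print(expand_page_spec("28-19"))
```

Under these assumptions a reversed range such as `28-19` expands to an empty list, which shows why a typo like that has to be corrected before the pages are fetched.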

In [None]:
# # manual text update
# thematic_texts[45] = """

# About Christmas, my master would give four or five days' holiday to his slaves; during which time, he supplied them plentifully with new whiskey, which kept them in a continual state of the most beastly intoxication. He often absolutely forced them to drink more, when they had told him they had had enough. He would then call them together, and say, "Now, you slaves, don't you see what bad use you have been making of your liberty? Don't you think you had better have a master, to look after you, and make you work, and keep you from such a brutal state, which is a disgrace to you, and would ultimately be an injury to the community at large?" Some of the slaves, in that whining, cringing manner, which is one of the baneful effects of slavery, would reply, "Yees, Massa; if we go on in dis way, no good at all."

#         Thus, by an artfully-contrived plan, the slaves themselves are made to put the seal upon their own servitude. The masters, by the system, are rendered as cunning and scheming as the slaves themselves.

#         "Joe," said a master, "if you will work well for me, you shall be buried in my grave." The slave said nothing, in reply; but thought, Massa is a bad man, and that he would not like to be buried near him. The slave thought he had been too near his master, all his life, and had rather be away from him, when he died. Seeing the slave idling, "Joe," shouted his master, "have you forgotten what I promised you, if you work well?" "No, Massa, me bemember; but me don't want." "What for, Joe?" "Because de debbil might some day come, and steal me away, in mistake for you, Massa." His master was silent on this subject ever afterwards.
# """

The output of the code below is saved in cleaned_thematic_text.txt. Since this only needs to be performed once, the code is commented out.

In [None]:
# # clean up and save output to a txt for future use
# from scrape_website import clean_up_text
# cleaned_thematic_texts = []
# for text in thematic_texts:
#     new_text = clean_up_text(text)
#     cleaned_thematic_texts.append(new_text)

# # save cleaned text to a txt file
# with open('cleaned_thematic_text.txt', 'w') as f:
#     for line in cleaned_thematic_texts:
#         f.write(line)
#         f.write("\n")
# cleaned_thematic_texts
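
Writing one text per line only round-trips cleanly if each text contains no line breaks of its own. What `clean_up_text` does is not shown here, so the sketch below simply assumes that whitespace is collapsed before saving. The sample strings and the temporary directory are illustrative, not part of the project.

```python
import tempfile
from pathlib import Path

# Invented sample texts (assumption); the first one contains a line break.
texts = ["first   excerpt\nwith a line break", "second excerpt"]

# Collapse all whitespace runs, including newlines, so each text fits on one line.
single_line = [" ".join(t.split()) for t in texts]

# Write to a temporary directory so the example leaves the project files alone.
path = Path(tempfile.mkdtemp()) / "cleaned_thematic_text.txt"
path.write_text("\n".join(single_line) + "\n", encoding="utf-8")

# Reload: one list entry per line, in the original order.
reloaded = path.read_text(encoding="utf-8").splitlines()
print(len(reloaded) == len(texts))
```

If the saved texts can contain newlines, a line-based reload no longer lines up with the rows of `content_guide_data`. A format with explicit record boundaries, such as JSON, would be a safer choice in that case.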