# Aggregate Sentence Splitting

This notebook runs `SentenceSplitting.ipynb` once for each year listed below and concatenates the results into a single, all-encompassing pandas DataFrame.
<br>
NOTE: The `%%capture` command at the top of some cells suppresses their output messages.

In [1]:
import pandas as pd
import sys
from tqdm import tqdm  # For printing out progress bar

In [2]:
# Define the list of years to aggregate
years = [1892, 1901]

In [3]:
# A dictionary to count the number of errors for all years
errorCountsAgg = {}

In [4]:
%%capture cap --no-stderr

# Create an empty dataframe
df_final = pd.DataFrame()

# Set up the progress bar
progress_bar = tqdm(total=len(years), file=sys.stderr)

# Iterate over the list
for year in years:

    # The %store command lets you pass variables between two different notebooks.
    # Store the year so that it can be picked up by the other notebook
    %store year

    # Run the notebook
    %run SentenceSplitting.ipynb

    # All variables, including the final dataframe,
    # should now be available in this notebook's scope.

    # Append this year's dataframe to the final dataframe
    df_final = pd.concat([df_final, df_cleaned])
    
    # Loop over this year's error counting dictionary and
    # add its counts to the overall error counting dictionary
    for key, value in errorsDict.items():
        # Add to the running total, starting from 0 for a key not seen before
        errorCountsAgg[key] = errorCountsAgg.get(key, 0) + value

    # Update the progress bar
    progress_bar.update(1)
    progress_bar.set_description(f"Processed year {year}")

# Close the progress bar
progress_bar.set_description("Processed the list")
progress_bar.close()

Processed the list: 100%|█████████████████████████| 2/2 [00:04<00:00,  2.30s/it]
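
The aggregation step in the cell above can also be sketched as plain Python, without the `%store` and `%run` magics. This is only a minimal illustration: the function `aggregate_years` and the `results` mapping are hypothetical stand-ins for what each run of `SentenceSplitting.ipynb` leaves behind, and the data is made up. It collects the per-year frames in a list and concatenates once, and it uses `collections.Counter` to sum the error counts.

```python
from collections import Counter

import pandas as pd


def aggregate_years(results):
    """Combine per-year (DataFrame, error-count dict) pairs.

    `results` is a hypothetical mapping of year -> (DataFrame, dict).
    """
    frames = []
    error_totals = Counter()
    for year in sorted(results):
        df_year, errors_year = results[year]
        frames.append(df_year)
        error_totals.update(errors_year)  # adds counts key by key
    # A single concat at the end avoids re-copying the growing frame each pass
    df_all = pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()
    return df_all, dict(error_totals)


# Made-up stand-in data, not real output from the notebook
results = {
    1892: (pd.DataFrame({"id": ["1892_0000", "1892_0001"]}), {"EOL hyphenation": 2}),
    1901: (pd.DataFrame({"id": ["1901_0000"]}), {"EOL hyphenation": 1, "Approved phrases": 4}),
}
df_all, totals = aggregate_years(results)
print(len(df_all), totals)
```

Note that `ignore_index=True` renumbers the rows from 0 across all years, whereas the notebook's `pd.concat` call keeps each year's own index. Which one is preferable depends on whether the per-year row numbers are needed later.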


In [5]:
errorCountsAgg

{'section identifiers': 414,
 'EOL hyphenation': 3852,
 'Approved phrases': 171,
 'Act seperators': 12,
 'Incorrect starting nums': 706}
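
To compare the error categories at a glance, the dictionary can be turned into a sorted table. The sketch below copies the totals from the output above (keys spelled exactly as they appear there); the `share` column and the variable names are additions for illustration only.

```python
import pandas as pd

# Totals copied from the output shown above
error_counts = {
    "section identifiers": 414,
    "EOL hyphenation": 3852,
    "Approved phrases": 171,
    "Act seperators": 12,
    "Incorrect starting nums": 706,
}

# Sort from most to least frequent and add each category's share of the total
counts = pd.Series(error_counts, name="count").sort_values(ascending=False)
summary = counts.to_frame()
summary["share"] = (counts / counts.sum()).round(3)
print(summary)
```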

In [6]:
df_final

Unnamed: 0,id,law_type,state,sentence,length,start_page,end_page
0,1892_0000,Acts,SOUTH CAROLINA,AN ACT to CoNSsTITUTE A BATTALION TO BE KNOWN AS THE NAVAL BATTALION OF VOLUNTEER TROOPS OF SOUTH CAROLINA.,109,045,045
1,1892_0001,Acts,SOUTH CAROLINA,"Be tt enacted by the Senate and House of Representatives of the State of South Carolina, now met and sitting in General Assembly, and by the authority of the same, That there shall be allowed, in addition to the companies of the Vol-t unteer Troops of the State of South Carolina as now provided by law, not more than four companies of Naval Militia, which shall constitute a battalion, to be known as the Naval Battalion of the Volunteer Troops of South Carolina.",477,045,046
2,1892_0002,Acts,SOUTH CAROLINA,"The officers of this battalion shall consist of a Lieutenant Commander, who shall be appointed by the Governor, and whose rank and pay shall assimilate to that of a Major of infantry, and a staff, to consist of one Adjutant, one Ordnance Officer, one Paymaster, who shall be the mustering officer, and one Surgeon, each with the rank of First Lieutenant.",356,046,046
3,1892_0003,Acts,SOUTH CAROLINA,They shall be paid the same as battalion staffs in the Volunteer Troops.,72,046,046
4,1892_0004,Acts,SOUTH CAROLINA,"There shall also be attached to the staff the following petty officers: One Master-at-Arms, two Yeomen, one Hospital Steward, one Chief Bugler, who shall receive the same pay as the non-commissioned staff of a battalion of infantry.",232,046,046
...,...,...,...,...,...,...,...
1330,1901_1330,Acts,SOUTH CAROLINA,"lands adjacent to said River; And whereas by the construction of said dam or dams the navigation of said River may be increased and the public interest promoted by the construction thereof for the purpose and for the sake of such improvement in the navigability of said River and for the public purposes to be ,fulfilled and encouraged by the construction of said dam or dams and for the purpose of removing any doubt which may arise as to the power and authority of the Secretary of State in granting the charter to the said Twin City Power Company for the erection of said dam or dams to be built across the said River: Now, Section 1. Be it enacted by the General Assembly of the State of South Carolina: That the right, power and privilege to construct and maintain a dam or dams across the Savannah River, as hereinbefore mentioned, to Twin City Power Company, its successors or assigns, shall be and is hereby fully authorized, ratified and confirmed; and that the said Twin City Power Company shall have all rights, powers and privileges conferred for the purpose of the acquisition and condemnation of land which may be overflowed by the erection or construction of said dam or dams as are conferred by Sections 1743-1755, inclusive, of the Revised Statutes of South Carolina, 1893, upon railway, canal and turnpike companies in the State and all of the Acts amendatory thereof; it being the intention of this Act for the sake of the public purposes intended to be carried out by said company to confer upon it all the rights, privileges and authorities conferred by the laws of this State upon railway, canal and turnpike companies in the acquisition and condemnation of property for rights of way or other interests in lands.",1751,00291,00291
1331,1901_1331,Acts,SOUTH CAROLINA,"Approved the 2oth day of February, A. D. 1901 AN ACT To EMPOWER AND AUTHORIZE THE CouNnTY BoarD oF COMMISSIONERS OF CHEROKEE COUNTY TO BUILD A BRIDGE ACROSS BroAaD RIVER AND Borrow MONEY THEREFOR FROM THE CoMMISSIONERS OF THE SINKING FUND.",243,00291,00292
1332,1901_1332,Acts,SOUTH CAROLINA,"Be it enacted by the General Assembly of the State of South Carolina: That the County Board of Commissioners of Cherokee County be, and they are hereby, authorized, if in their discretion they deem that it is for the best interest of said County, to borrow a sum of money from the Sinking Fund of the State of South Carolina, not to exceed ten thousand dollars, at a rate of interest not to exceed five per centum per annum, for the purpose of building a bridge across Broad River, in said County, at such point on said river as they may deem most practicable, and a special tax of one-half mill on the dollar may be levied on all taxable property in the County of Cherokee, provided the Board of Commissioners so decide to build said bridge, for the said period of seven years, for the purpose of repaying said loan.",836,00292,00292
1333,1901_1333,Acts,SOUTH CAROLINA,"That the proceeds of said levy of onehalf mill shall be paid each year on said loan until the seventh year, in which year the balance remaining due on said loan shall be paid from said special levy, if any remain it shall-be turned into the County Treasury for ordinary County purposes, and if a’sufficient sum has not been realized by said special levy at the expiration of said seven years the deficiency shall be paid by the County Board of Commissioners out of the ordinary County funds.",493,00292,00292
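
A few quick sanity checks on the combined dataframe might look like the sketch below. It assumes, based only on the rows displayed above, that `id` has the form `<year>_<counter>` and that the page columns hold zero-padded numeric strings. The sample frame is a hand-built stand-in with the sentence text omitted, not the real `df_final`.

```python
import pandas as pd

# Three rows in the shape of the table above (sentence text omitted)
df_sample = pd.DataFrame(
    {
        "id": ["1892_0000", "1892_0003", "1901_1331"],
        "start_page": ["045", "046", "00291"],
        "end_page": ["045", "046", "00292"],
    }
)

# Every sentence id should appear exactly once across all years
ids_unique = df_sample["id"].is_unique

# Assumed id layout: the year is the part before the underscore
df_sample["year"] = df_sample["id"].str.split("_").str[0].astype(int)
rows_per_year = df_sample.groupby("year").size()

# Page ranges should not run backwards
pages_ok = df_sample["start_page"].astype(int) <= df_sample["end_page"].astype(int)

print(ids_unique, pages_ok.all())
print(rows_per_year)
```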
