Oregon Trail Simulation and Dataset Generation: Probability of Fatal Illness Among Pioneers on the Trail

Create dataset relevant to illness stats and Oregon Trail Landmarks:


Illness Specifics

https://www.nps.gov/articles/000/death-on-trails.htm

'Nearly one in ten emigrants who set off on the trail did not survive.'

'It is estimated that 6-10% of all emigrants of the trails succumbed to some form of illness.'

'Of the estimated 350,000 who started the journey, disease may have claimed as many as 30,000 victims.'
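As a quick sanity check, the quoted figures can be compared with each other. The sketch below uses only the numbers from the NPS quotes; the variable names are illustrative.

```python
# Figures quoted from the NPS article above
emigrants_started = 350_000
disease_deaths_upper = 30_000

# Upper-bound share of emigrants lost to disease
disease_share = disease_deaths_upper / emigrants_started
print(f"Disease deaths (upper bound): {disease_share:.1%}")

# The quoted illness range is 6-10%, so the upper bound should fall inside it
low, high = 0.06, 0.10
print("Within quoted 6-10% range:", low <= disease_share <= high)
```

The upper bound works out to roughly 8.6%, which sits inside the quoted 6-10% range and close to the "nearly one in ten" overall death rate.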

Cholera: This disease resulted in more illness and death than all of the other maladies experienced by the emigrants. Cholera results from a waterborne bacteria that thrives in polluted, stagnant water. It progresses rapidly and attacks the intestinal lining, producing severe diarrhea, vomiting, abdominal pain, and cramps. The effects were so severe and rapid that victims often died within 12 hours of the first symptoms. Some of the medicines that emigrants had to combat cholera were camphor and laudanum. These were painkillers and cough suppressants that did little to cure the disease.

Dysentery: A common ailment that can strike any group exposed to changes in their living habits, especially if accompanied by unsanitary conditions. Although seldom fatal if treated, it can be very dangerous for the very young and elderly. Castor oil was used to treat dysentery and other bowel disorders.

Mountain fever: Usually not fatal, with symptoms that include intestinal discomfort, respiratory distress, and fever. The diseases that fit these symptoms are: Rocky Mountain spotted fever, typhoid fever, and scarlet fever. Quinine water was used to treat Rocky Mountain spotted fever, chills and pneumonia.

Measles: A viral disease that is more common among children, but can have a serious effect upon adults.

Food poisoning: A problem with contaminated food, more likely among single men.

Scurvy: Weakens and deteriorates body conditions resulting from diets lacking in vitamin C. Citric acid was used to prevent and treat scurvy.

Smallpox: A viral disease that was very contagious causing high fever and dehydration.

Pneumonia: A respiratory ailment that is common among groups experiencing unsanitary conditions or exposure to drastic weather changes.

Headaches, coughs, muscle aches: Turpentine, vinegar and whiskey were among the treatments for these ailments.


Landmark Specifics

https://www.history.com/topics/19th-century/oregon-trail#oregon-trail-route

'The trail was arduous and snaked through Missouri and present-day Kansas, Nebraska, Wyoming, Idaho and finally into Oregon.'

'...for the most part, settlers crossed the Great Plains until they reached their first trading post at Fort Kearny, Nebraska, averaging between ten and fifteen miles per day.'

'From Fort Kearney, they followed the Platte River over 600 miles to Fort Laramie, Wyoming, and then ascended the Rocky Mountains where they faced hot days and cold nights.'
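The quoted pace and distance give a rough travel time for the Platte River leg. This is back-of-the-envelope arithmetic on the quoted numbers only; travel_days is a hypothetical helper that assumes a constant daily pace.

```python
def travel_days(miles, miles_per_day):
    """Days needed to cover `miles` at a constant daily pace."""
    return miles / miles_per_day

platte_leg_miles = 600  # "over 600 miles" in the quote, so treat this as a lower bound
slow_days = travel_days(platte_leg_miles, 10)  # ten miles per day
fast_days = travel_days(platte_leg_miles, 15)  # fifteen miles per day
print(f"Fort Kearny to Fort Laramie: roughly {fast_days:.0f} to {slow_days:.0f} days")
```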

'The settlers gave a sigh of relief if they reached Independence Rock—a huge granite rock in Wyoming that marked the halfway point of their journey'

'After leaving Independence Rock, settlers climbed the Rocky Mountains to the South Pass. Then they crossed the desert to Fort Hall, the second trading post.'

'From there they navigated Snake River Canyon and a steep, dangerous climb over the Blue Mountains before moving along the Columbia River to the settlement of Dalles and finally to Oregon City.'


https://traveltips.usatoday.com/landmarks-stops-oregon-trail-10474.html

'Alcove Spring and its waterfall near Blue Rapids, Kansas, are part of a 300-acre park located near Independence Crossing, a popular camping ground for emigrants traveling the Oregon Trail who were waiting to ferry across the Big Blue River.'


https://www.oldwest.org/oregon-trail-landmarks/

'Ash Hollow was a favorite campsite, offering plenty of wood, clean water, and ample grazing for their animals.'

'Located near present-day Bridgeport, Nebraska, these hulking pillars still mark the way for travelers on State Route 88.'

'Emigrants could see Chimney Rock on the horizon for days before they reached it.'

'The next landmark travelers looked for was Scotts Bluff, a large rock plateau along the North Platte River.'
'...the bluff was one of the most commonly mentioned in emigrant journals.'

'Near where Sublette’s Cutoff and the main trail reunited, travelers came upon Soda Springs in the Bear River Valley.'

'After the South Pass, most travelers continued southwest on the main trail past Fort Bridger.'


In [1]:
#initial datasets: landmark dataframe, plus a list of illnesses with fatality probabilities (rough estimates informed by the stats quoted above)
import pandas as pd
url = 'https://raw.githubusercontent.com/A-Bin1/Statistical-Samples/main/landmark_data_oregon_trail_simulation.csv'
landmarkdf = pd.read_csv(url)

illnessdf = pd.DataFrame({'Illness': ['Cholera', 'Dysentery', 'Mountain fever', 'Measles', 'Food poisoning', 
'Scurvy', 'Smallpox', 'Pneumonia', 'Headaches, coughs, muscle aches'], 'FatalProb': [0.5, 0.2, 0.02, 0.06, 0.05, 0.04, 0.07, 0.05, 0.01]})


In [2]:
illnessdf

Unnamed: 0,Illness,FatalProb
0,Cholera,0.5
1,Dysentery,0.2
2,Mountain fever,0.02
3,Measles,0.06
4,Food poisoning,0.05
5,Scurvy,0.04
6,Smallpox,0.07
7,Pneumonia,0.05
8,"Headaches, coughs, muscle aches",0.01
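The simulation later draws an illness uniformly at random, so the average of the FatalProb column is the fatality probability of a randomly drawn illness. A small sketch, with the table copied into a plain dict so it runs without pandas (the variable names are illustrative):

```python
# Same values as illnessdf above
fatal_prob = {
    'Cholera': 0.5,
    'Dysentery': 0.2,
    'Mountain fever': 0.02,
    'Measles': 0.06,
    'Food poisoning': 0.05,
    'Scurvy': 0.04,
    'Smallpox': 0.07,
    'Pneumonia': 0.05,
    'Headaches, coughs, muscle aches': 0.01,
}

# Assumption: every illness is equally likely to be drawn
mean_fatal_prob = sum(fatal_prob.values()) / len(fatal_prob)
deadliest = max(fatal_prob, key=fatal_prob.get)
print(f"Mean fatality probability: {mean_fatal_prob:.3f}")
print("Deadliest illness:", deadliest)
```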


In [3]:
landmarkdf
#supplies column indicates supplies generally found at forts or a known clean water site

Unnamed: 0,Landmark,Order,Supplies
0,"Independence, MO",1,Y
1,Missouri River,2,N
2,Alcove Spring,3,N
3,"Big Blue River, Kansas",4,N
4,"Fort Kearny, NE",5,Y
5,Platte River,6,N
6,"Ash Hollow, Nebraska",7,Y
7,"Chimney Rock, NE",8,N
8,Scotts Bluff,9,N
9,"Fort Laramie, WY",10,Y


In [558]:
#geocode for coordinates
#https://thepythoncode.com/article/get-geolocation-in-python#Reverse_Geocoding
import numpy as np
import geopy.distance
from geopy.geocoders import Nominatim
import time
from pprint import pprint

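def getCoords(lm):
    #look up (latitude, longitude) for each landmark name with the Nominatim geocoder
    app = Nominatim(user_agent="tutorial")
    coords = []
    for l in list(lm):
        location = app.geocode(l)
        if location is None:
            raise ValueError('No geocoding result for ' + str(l))
        coords.append((location.latitude, location.longitude))
        time.sleep(1) #Nominatim's usage policy allows at most one request per second
    return coords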



#miles traveled for each landmark
def getMilesPerStop(cl):
    listc = getCoords(cl)
    mileList = []
    for i in range(len(listc)):
        if i > 0:
            #geodesic miles from the previous stop, divided by 2.4 to adjust for inaccurate
            #placement of coordinate data for general landmark names (trail took 4-6 months)
            mileList.append(round(geopy.distance.geodesic(listc[i-1], listc[i]).miles,2)/2.4)
        else:
            #no miles traveled yet at the first stop
            mileList.append(0)
    return mileList
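The per-stop mileage logic can be tried offline without geocoding. The sketch below is not the notebook's method: it swaps geopy's ellipsoidal geodesic for the simpler haversine (great-circle) formula, and the two coordinates are approximate values assumed for illustration rather than Nominatim results. The names haversine_miles, miles_per_stop, and divisor are hypothetical.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius

def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) pairs given in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))

def miles_per_stop(coords, divisor=1.0):
    """Miles from the previous stop for each stop; the first stop is 0."""
    if not coords:
        return []
    miles = [0.0]
    for prev, cur in zip(coords, coords[1:]):
        miles.append(haversine_miles(prev, cur) / divisor)
    return miles

# Approximate coordinates, assumed for illustration only
sample_coords = [
    (39.09, -94.42),  # Independence, MO (approximate)
    (40.64, -99.00),  # Fort Kearny, NE (approximate)
]
sample_miles = miles_per_stop(sample_coords)
print([round(m, 1) for m in sample_miles])
```

Passing divisor=2.4 would mirror the adjustment used in getMilesPerStop.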



In [180]:
def getDistanceFromAid(milenumber):
    #cumulative miles at each landmark flagged as having supplies
    suppliesList = list(landmarkdf.loc[landmarkdf['Supplies'] == 'Y', 'Landmark'])
    totalSupplyMiles = np.cumsum(getMilesPerStop(suppliesList)) # returns a np.ndarray
    #signed offset to every supply stop: positive = stop is ahead, negative = stop is behind
    offsets = totalSupplyMiles - milenumber
    #closest stop wins; at exactly halfway the stop ahead is preferred
    closestSupply = min(offsets, key=lambda d: (abs(d), -d))
    return round(float(closestSupply), 1)

In [185]:
getDistanceFromAid(120)

63.3
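Because getDistanceFromAid geocodes the supply stops on every call, it can help to compute the cumulative supply miles once and reuse them. The following is a minimal sketch of that idea using a binary search over a precomputed list. The function name signed_distance_to_nearest and the mile markers [0, 100, 250] are hypothetical, and the list is assumed to be sorted and non-empty.

```python
from bisect import bisect_right

def signed_distance_to_nearest(mile, supply_miles):
    """Signed miles to the closest supply stop.

    Positive: the closest stop is ahead. Negative: it is behind.
    supply_miles must be a sorted, non-empty list of cumulative mile markers.
    """
    i = bisect_right(supply_miles, mile)
    behind = supply_miles[i - 1] if i > 0 else None
    ahead = supply_miles[i] if i < len(supply_miles) else None
    if behind is None:   # before the first stop
        return round(ahead - mile, 1)
    if ahead is None:    # past the last stop
        return round(-(mile - behind), 1)
    to_ahead = ahead - mile
    from_behind = mile - behind
    # halfway or more toward the next stop: report the stop ahead
    return round(to_ahead if from_behind >= to_ahead else -from_behind, 1)

supply_miles = [0, 100, 250]  # hypothetical cumulative miles, not geocoded values
for mile in (30, 50, 60, 300):
    print(mile, signed_distance_to_nearest(mile, supply_miles))
```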

In [74]:
#generate wagons of people
import random as rand
from random import randrange

def genWagons(n):
    wagon_party = {}
    for wagon in range(n):
        key = 'Wagon'+ str(wagon+1)
        #exclude 0 from generator - wagons of dead people can't get any more dead
        wagon_party[key] = randrange(1,10,1) #party size from 1 to 9
    return wagon_party

genWagons(4)

{'Wagon1': 3, 'Wagon2': 6, 'Wagon3': 4, 'Wagon4': 8}
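The party sizes above change on every run. For repeatable experiments, one option is a seeded generator. This is a sketch, not part of the original notebook; gen_wagons_seeded and its seed parameter are hypothetical names.

```python
import random

def gen_wagons_seeded(n, seed=None):
    """Like genWagons, but draws from a private, optionally seeded generator."""
    rng = random.Random(seed)  # leaves the global random state untouched
    # randrange(1, 10) yields party sizes from 1 to 9, never 0
    return {f'Wagon{i + 1}': rng.randrange(1, 10) for i in range(n)}

party_a = gen_wagons_seeded(4, seed=42)
party_b = gen_wagons_seeded(4, seed=42)
print(party_a == party_b)  # same seed, same wagons
```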

In [440]:
def t_f(prob):
    #random draw: True with probability prob (never True for 0, always True for 1)
    return rand.random() < prob

def deathByIllness(illness, distance, wagonparty):
    # A death requires three independent draws to all come up True:
    # contracting an illness, the supply-distance draw, and the illness's own fatality draw
    #chance of contracting an illness grows with party size (10% per member)
    contractIllness = t_f(0.1*wagonparty)
    if not contractIllness:
        return False
    supplydistance = np.abs(getDistanceFromAid(distance))
    if illness == 'Headaches, coughs, muscle aches' and supplydistance <= 200: #death by aches wasn't a thing
        suppliesfatalProb = t_f(0) #you will live
    elif supplydistance <= 10:
        suppliesfatalProb = t_f(0.8) #you will probably die
    else:
        suppliesfatalProb = t_f(1) #you will die
    #look up the fatality probability for this illness as a plain number
    fatalprob = illnessdf.loc[illnessdf['Illness'] == illness, 'FatalProb'].iloc[0]
    return bool(suppliesfatalProb and t_f(fatalprob))

In [444]:
deathByIllness('Cholera', 80, 9)

True
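Since deathByIllness combines three independent draws, the chance of a death at one stop is the product of their probabilities. The sketch below works that product out and checks it with a seeded Monte Carlo run. The helper names death_probability and simulate are hypothetical, and the thresholds are copied from the function above.

```python
import random

def death_probability(fatal_prob, party_size, supply_distance, aches=False):
    """P(death) = P(contract illness) * P(supply draw) * P(illness is fatal)."""
    p_contract = min(0.1 * party_size, 1.0)
    if aches and supply_distance <= 200:
        p_supply = 0.0  # aches are never fatal within 200 miles of supplies
    elif supply_distance <= 10:
        p_supply = 0.8
    else:
        p_supply = 1.0
    return p_contract * p_supply * fatal_prob

def simulate(fatal_prob, party_size, supply_distance, trials=100_000, seed=0):
    """Monte Carlo estimate of the same probability (non-aches illnesses only)."""
    rng = random.Random(seed)
    p_contract = min(0.1 * party_size, 1.0)
    p_supply = 0.8 if supply_distance <= 10 else 1.0
    deaths = sum(
        rng.random() < p_contract and rng.random() < p_supply and rng.random() < fatal_prob
        for _ in range(trials)
    )
    return deaths / trials

# Cholera (FatalProb 0.5), party of 9, more than 10 miles from supplies
print(round(death_probability(0.5, 9, 63.3), 2))  # 0.9 * 1.0 * 0.5 = 0.45
print(simulate(0.5, 9, 63.3))                     # should land close to 0.45
```

Under these settings a party of 9 loses a member to cholera at nearly half of the stops where cholera is drawn, which helps explain the heavy losses in the simulation output.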

Create the Simulation. 

Note:

It is possible for a wagon party member to die at Independence, MO in this simulation.

It is also possible for an empty wagon to arrive in Oregon.

In [572]:
wagonfDict = genWagons(10)
landmarkList_f = landmarkdf['Landmark']
diseaselist_f = illnessdf['Illness']
mileList_f = getMilesPerStop(landmarkList_f)
totalMiles_f = round(sum(mileList_f),0)
stopsPerTravelPeriod_f = np.arange(0, totalMiles_f, 67) #total miles divided into even increments
fateResults = {}
wagonDictUpdate = {}
for wagon in wagonfDict:
        #print(wagon)
        originalparty = wagonfDict[wagon]+1
        newparty = wagonfDict[wagon]
        while newparty in range(1,originalparty):
            for i in range(len(stopsPerTravelPeriod_f)):
                disRandInd = randrange(len(diseaselist_f)) #index for selecting random disease (randrange already excludes the upper bound)
                randomDisease = diseaselist_f[disRandInd]
                frKey = str(wagon) + str(stopsPerTravelPeriod_f[i])
                wduKey = str(wagon) + str(stopsPerTravelPeriod_f[i])
                print('Pioneer Count: ' + str(newparty), wagon, 'Mile Number: ' + str(stopsPerTravelPeriod_f[i]), end='\r')                   
                if deathByIllness(randomDisease, stopsPerTravelPeriod_f[i], newparty) == True:
                    print(str(randomDisease) + '  killed pioneer in ' + str(wagon) + ' at ' + str(stopsPerTravelPeriod_f[i]) + ' miles.'
                    + """
                    Updating the wagon party...""")
                    newparty = newparty-1
                    fateResults[frKey] = (randomDisease, stopsPerTravelPeriod_f[i], newparty, wagon)
                    
                    wagonDictUpdate[wduKey] = (newparty)

                else:
                    wagonDictUpdate[wduKey] = (newparty)
                    fateResults[frKey] = ('No Illness', stopsPerTravelPeriod_f[i], newparty, wagon)

                if newparty == 0 and stopsPerTravelPeriod_f[i] < stopsPerTravelPeriod_f[-1]:
                        print('Everyone died before you reached Oregon.')
                        break
                elif newparty == 0 and stopsPerTravelPeriod_f[i] == stopsPerTravelPeriod_f[-1]:
                        print('Congratulations. Your party died, but at least your wagon made it to Oregon!')
                        break
                elif newparty != 0 and stopsPerTravelPeriod_f[i] == stopsPerTravelPeriod_f[-1]:
                        print('Congratulations! ' + str(newparty) + ' of you in ' + str(wagon) + ' made it to Oregon!')
                        newparty = 0
                        break
            
              


Dysentery  killed pioneer in Wagon1 at 938.0 miles.
                    Updating the wagon party...
Food poisoning  killed pioneer in Wagon1 at 1005.0 miles.
                    Updating the wagon party...
Cholera  killed pioneer in Wagon1 at 1273.0 miles.
                    Updating the wagon party...
Dysentery  killed pioneer in Wagon1 at 1340.0 miles.
                    Updating the wagon party...
Cholera  killed pioneer in Wagon1 at 1541.0 miles.
                    Updating the wagon party...
Measles  killed pioneer in Wagon1 at 1809.0 miles.
                    Updating the wagon party...
Congratulations! 3 of you in Wagon1 made it to Oregon!
Cholera  killed pioneer in Wagon2 at 0.0 miles.
                    Updating the wagon party...
Cholera  killed pioneer in Wagon2 at 670.0 miles.
                    Updating the wagon party...
Cholera  killed pioneer in Wagon2 at 2010.0 miles.
                    Updating the wagon party...
Congratulations! 6 of you in Wagon2 made it to O

In [573]:
#unpack the fateresults dictionary
fateResults_df = pd.DataFrame(fateResults.values()) 
fateResults_df = fateResults_df.rename(columns={0: 'ContractIllness', 1: 'MileNumber', 2: 'WagonParty', 3:'WagonNumber'})

In [577]:
fateResults_df.head(60)

Unnamed: 0,ContractIllness,MileNumber,WagonParty,WagonNumber
0,No Illness,0.0,9,Wagon1
1,No Illness,67.0,9,Wagon1
2,No Illness,134.0,9,Wagon1
3,No Illness,201.0,9,Wagon1
4,No Illness,268.0,9,Wagon1
5,No Illness,335.0,9,Wagon1
6,No Illness,402.0,9,Wagon1
7,No Illness,469.0,9,Wagon1
8,No Illness,536.0,9,Wagon1
9,No Illness,603.0,9,Wagon1
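Once results are collected, deaths can be tallied by illness and by wagon. The sketch below uses the deaths printed in the simulation output above, re-entered by hand as (illness, mile, wagon) tuples. That shape is a simplification of the fateResults records, and it covers only the wagons visible in the printed output.

```python
from collections import Counter

# Deaths copied from the printed simulation output above
deaths = [
    ('Dysentery', 938.0, 'Wagon1'),
    ('Food poisoning', 1005.0, 'Wagon1'),
    ('Cholera', 1273.0, 'Wagon1'),
    ('Dysentery', 1340.0, 'Wagon1'),
    ('Cholera', 1541.0, 'Wagon1'),
    ('Measles', 1809.0, 'Wagon1'),
    ('Cholera', 0.0, 'Wagon2'),
    ('Cholera', 670.0, 'Wagon2'),
    ('Cholera', 2010.0, 'Wagon2'),
]

deaths_by_illness = Counter(illness for illness, _, _ in deaths)
deaths_by_wagon = Counter(wagon for _, _, wagon in deaths)

print(deaths_by_illness.most_common())
print(dict(deaths_by_wagon))
```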
