In [1]:
import pandas as pd
import yaml

## Plan

### Here are the steps you need to complete to make this function:
-  [x] open the .yaml file with tree properties
    - [x] use these tree properties to assign a death year for each tree based on (1) species and (2) age/year planted
    
- [x] develop the architecture of the .yaml file with strategies. What will this look like? What information will it hold? How flexible/rigid will it be (at first)?

- [x] first off, based on the initial data, determine the death year of all of the trees. append this to a column 'death year' in the new dataframe

- [x]  start by making a function that simply creates the new event dataframe assuming no strategies are implemented. I.e. all the rows will have status of either "plant" or "replant." End year is determined by the death year, and replant is simply determined by the death year + 1.

- [ ]  add to that function the ability to add pruning, so the status will now be "plant," "replant," or "prune"
    * pruning will have to be implemented with some sort of year crieteria (in what year, or even better, at what age, should farmers prune?)
    * with the new row that represents the pruning event, death year carries over but is altered slightly (because we are working under the assumption that pruning increases production life)
    * unless another event is called, end year is death year
    
-  [ ] add to that function the ability to add intercropping.
    * intercropping will be implemented with (1) a year criteria and (2) a proportion criteria (what proportion of the trees will be replanted). 
    * if the trees in the originaal dataset reach the year or age criteria, the original row will become inactive because the 'end year' has been reached. The row will then be split into two rows (but both still show the same ID number, which is how we identify them as the same plot). The first row will be the trees that were left alone, and so it will be the exact same except that is has proportionally less cuerdas (the death year will carry over from the original). The second row will have the trees which were replanted. Their


In [2]:
def openYaml(yamlFilePath : str) -> dict: 
    yamlFile = open(yamlFilePath)
    parsed = yaml.load(yamlFile, Loader =yaml.FullLoader)
    return(parsed)

In [3]:
# open the yaml files  to assign attirbutes to the sim
treeAttributes = openYaml("data/trees.yml")
strategyAttributes = openYaml("intervention/strategy1.yml")

In [4]:
# initialData = pd.read_csv("data/demoData.csv")
upData = pd.read_csv("data/fakeData.csv")
upData.head()

Unnamed: 0,plotID,farmerName,treeType,numCuerdas,yearPlanted,ageOfTrees
0,0,Ing. José Emilio Godínez,catuai,0,2010,10
1,1,Sr(a). Rodrigo Rael,catuai,3,2020,0
2,2,Dr. Graciela Urías,borbon,9,1996,24
3,3,Ing. Leticia Villareal,borbon,2,1992,28
4,4,Teresa Francisco Javier Ocampo Mejía,borbon,8,1999,21


In [5]:
# yearPlanted = 2020 - initialData['ageOfTrees']
# initialData['yearPlanted'] = yearPlanted
# print(initialData.keys())
# new = initialData.rename(columns={'Unnamed: 0': 'plotID'})
# new.to_csv("data/demoDataUpdated.csv",  index = False)

In [6]:
def isolateAttributes(attributes:dict, treeType:str):
    """
    Takes in a dictionary containing all of the tree attributes, as well as the name of
    the tree type, then returns a dictionary with only the attributes of the tree
    `treeType`.
    """
    
    keys = list(treeAttributes.keys())
    altOrth = [treeAttributes[key]['altOrth'] for key in treeAttributes]
    # tipos = keys + altOrth # all of the possible spellings for the tree types
            
    if treeType in keys:
        treeDict = treeAttributes[treeType]
                
    elif treeType in altOrth:
        keyPair = [(key, treeAttributes[key]['altOrth']) for key in treeAttributes]
        _treeType = ''
        for i,e in enumerate(keyPair):
            if treeType == e[1]: # if it's the altOrth
                _treeType = e[0] # key to the key
                
            
            if len(_treeType) > 0:
                treeDict = treeAttributes[_treeType]
                
            else:
                raise AttributeError(
                """
                '%s' is not a recognized value (orthography) in the `treeAttributes` dict.
                
                """%(treeType))
                
    else:
        raise AttributeError(
        """
        '%s' is not a recognized value (orthography) in the `treeAttributes` dict.
                
        """%(treeType))
        
    return(treeDict)

In [7]:
def transformData(year:int,
                  simulationYears:int,
                  farmData:pd.DataFrame,
                  treeAttributes:dict=None,
                  strategyAttributes:dict=None):
    """
    takes in data from repository and returns a new,  transformed dataframe that
    tracks events.
    
    year is an int of the  year where the simulation starts.  if the simulation moves forward
    from  the present, the year is the current year. else it is  the  year the simulation
    begins.
    
    simulationYears is the amount of years that the simulation will iterate through. This
    is necessary to make sure the transformed data only captures events within this range. 
    
    farm data is data frame with farmer's plots
    
    tree attributes is dictionary opened from yaml file with attritbutes of trees.
    
    strategy attributes is dictionary opened from yaml file with attributes of strategies.
    
    returns dataframe with events
    
    
    Notes
    -----
    
    as of now, the condition is that intercrop year and prune year are not in the same. but I might be able to figure out how to work that out. 
    """
    
    endYear = year + simulationYears
    
    # iterate through each row of the original plot dataframe
    for i in range(len(farmData)):
        plotID = farmData["plotID"][i]
        farmerName = farmData["farmerName"][i]
        treeType = farmData["treeType"][i]
        numCuerdas =  farmData["numCuerdas"][i]
        startYear = farmData["yearPlanted"][i]
        
        # assume all are planted for initialization
        status = "plant" 

        treeAge = year - startYear

        # check to see that this tree exists in config file
        # _altOrth = [treeAttributes[item]["altOrth"] for item in treeAttributes]
        
        if treeAttributes:
            # isolate the dictionary we are concerned with on this plot
            treeDict = isolateAttributes(attributes=treeAttributes, treeType=treeType)

            # isolate individual variables from this dict
            cuerdaHarvestCap  = treeDict["cuerdaHarvestCap"]
            firstHarvest = treeDict["firstHarvest"]
            fullHarvest = treeDict["fullHarvest"]
            descentHarvest = treeDict["descentHarvest"]
            death = treeDict["death"]

            # calculate death year
            yearsTillDeath = death["year"] - treeAge
            deathYear = year + yearsTillDeath

            # create the initial row for the transformed dataframe
            row = pd.DataFrame([[plotID, farmerName, treeType, numCuerdas, status, 
                                      startYear, deathYear]], columns=["plotID", "farmerName", "treeType", 
                                                "numCuerdas", "status", "startYear",
                                               "deathYear"])
           
            # if this is the first row of the whole transformation
            if (i == 0):
                # initialize the transformation dataframe
                transformedData = row

            else:
                transformedData = pd.concat([transformedData, row], ignore_index=True)
                #transformedData  = transformedData.reset_index(drop = True, inplace = True)
                
                
            # now you've transformed all of the original entries to the new format
            # now you should be iterating through transformed data to add events

            # check to see if replant is in strategy (it always should be)
            if strategyAttributes["replant"]["isReplant"] ==  True:
                replantYear = (deathYear + 1)
            else:
                replantYear = None

            # check to see if prune  is in strategy config
            if strategyAttributes["prune"]["isPrune"] ==  True:
                pruneAge = strategyAttributes["prune"]["age"]
                lifeExtend = strategyAttributes["prune"]["lifeExtend"]
            else:
                pruneAge = None

            # check to see if  intercrop is in strategy config
            if strategyAttributes["intercrop"] == True:
                intercropAge = strategyAttributes["prune"]["age"]
            else:
                intercropAge = None

            # for this specific plot (see plotID),
            # create a row to check against to see if the program needs to continue creating events
            checkRow = row

            # create a  new var for the year of transformation for this plot
            simYear = year
            # create a new var for the tree's age for this plot
            simTreeAge = treeAge

            # iterate through all years of the simulation to check event sequences
            while (simYear < endYear):
               #  isolate dict
                deathYear = checkRow["deathYear"][0]


                if (replantYear):
                    # death takes precedence over pruning 
                    if (simYear == deathYear):
                        # update death year
                        simTreeAge = -1
                        deathYear = (simYear +  1) + death["year"]
                        status = "replant"
                        nextRow = pd.DataFrame([[plotID, farmerName, treeType, numCuerdas, status, replantYear, deathYear]], 
                                            columns=["plotID", "farmerName","treeType", "numCuerdas", "status", "startYear","deathYear"])
                        transformedData  = pd.concat([transformedData, nextRow],  ignore_index=True)
                        checkRow = nextRow
                        simYear += 1
                        simTreeAge += 1
                        
                        # no more than one action per year IF action is death
                        continue


                    elif (pruneAge):
                        if (simTreeAge == pruneAge):
                            # add years proportional to tree's lifespan:
                            addedYears = round((death["year"] * lifeExtend))
                            adjustedDeathYear = (checkRow["deathYear"][0]) + addedYears
                            pruneYear = simYear
                            status = "prune"
                            nextRow = pd.DataFrame([[plotID, farmerName,treeType, numCuerdas, status, pruneYear, adjustedDeathYear]], 
                                                   columns=["plotID", "farmerName","treeType", "numCuerdas", "status", "startYear","deathYear"])
                            transformedData  = pd.concat([transformedData, nextRow],  ignore_index=True)
                            checkRow  = nextRow
                            simYear += 1
                            simTreeAge += 1
                            continue

                        else:
                            simYear += 1
                            simTreeAge += 1
                            continue
                            
                    else:
                        simYear += 1
                        simTreeAge += 1
                        continue

                else:
                    simYear += 1
                    simTreeAge += 1
                    continue



        else:
            print("No tree attributes!!!")
            print(treeType)
            break
            
        
    return(transformedData)

In [10]:
startYear = 2021

In [12]:
simData = transformData(year=startYear, simulationYears=30, farmData=upData, treeAttributes=treeAttributes, strategyAttributes=strategyAttributes)
simData.head(20)

Unnamed: 0,plotID,farmerName,treeType,numCuerdas,status,startYear,deathYear
0,0,Ing. José Emilio Godínez,catuai,0,plant,2010,2027
1,0,Ing. José Emilio Godínez,catuai,0,replant,2028,2045
2,0,Ing. José Emilio Godínez,catuai,0,prune,2036,2048
3,0,Ing. José Emilio Godínez,catuai,0,replant,2028,2066
4,1,Sr(a). Rodrigo Rael,catuai,3,plant,2020,2037
5,1,Sr(a). Rodrigo Rael,catuai,3,prune,2028,2040
6,1,Sr(a). Rodrigo Rael,catuai,3,replant,2038,2058
7,1,Sr(a). Rodrigo Rael,catuai,3,prune,2049,2061
8,2,Dr. Graciela Urías,borbon,9,plant,1996,2026
9,2,Dr. Graciela Urías,borbon,9,replant,2027,2057


In [9]:
# returns number of events based on plot ID
simData.groupby("plotID")["startYear"].count()

plotID
0     4
1     4
2     3
3     3
4     3
     ..
95    3
96    4
97    4
98    3
99    3
Name: startYear, Length: 100, dtype: int64

In [10]:
def groupByPlot(eventData:pd.DataFrame, plotID:int):
    """
    Takes a dataframe of farm events and isolates the rows relevant to the specific plot.
    
    Params: `eventData` (pd.DataFrame), `plotID` (int)
    
    Returns: `plot` (isolated pd.DataFrame)
    """
    plot = eventData[eventData["plotID"] == plotID]
    
    return(plot)

In [82]:
class Farmarelli:
    def __init__(self, 
                 eventRows:pd.DataFrame,
                 initialYear:int,
                 treeAttributes:dict=None,
                 sowDensity:int=333):
        
        # sort df by event start year (ascending) then zero-index
        sortedEvents = eventRows.sort_values(by=["startYear"])
        self.sortedEvents = sortedEvents.reset_index(drop=True)
        
        # assign initial year of simulation
        self.initialYear = initialYear
        # initialize current year to first year of simualtion
        self.currentYear = initialYear
        
        self.plotID = self.sortedEvents["plotID"][0]
        self.farmerName = self.sortedEvents["farmerName"][0]
        self.treeType = self.sortedEvents["treeType"][0]
        self.numCuerdas = self.sortedEvents["numCuerdas"][0]
        
        # initialize event variables
        self.initializeEvents()
        
        # assign tree attributes for growth constants
        self.inheretTreeProperties()
        
    def initializeEvents(self):
        """
        After core member variables have been assigned, call `initializeEvents` to setup the status of the 
        plot in the simulation upon instantiation.
        """
        # how many event entries there are for this plot
        self.numEvents = len(self.sortedEvents)
        
        # determine the initializing event
        self.initialEvent = self.sortedEvents.loc[0]

        # set the current event to the initializing event
        # (assumes simulation begins before second event)
        self.currentEvent = self.initialEvent
        self.currentEventNum = 0
        self.currentStatus = self.currentEvent["status"]
        
    def inheretTreeProperties(self, treeAttributes):
        
        """
        Based on the argument treeType in the initializer function, assign parameters for the respective
        life & production patterns of the trees on the cuerda.
        
        The following property assignments were developed from data collected from the co-op in 2014
    
        """
        self.treeType = self.treeType.lower() # convert to lower case for easier parsing
        
        if treeAttributes:
            keys = list(treeAttributes.keys())
            altOrth = [treeAttributes[key]['altOrth'] for key in treeAttributes]
            # tipos = keys + altOrth # all of the possible spellings for the tree types
            
            if treeType in keys:
                treeDict = treeAttributes[treeType]
                
            elif treeType in altOrth:
                keyPair = [(key, treeAttributes[key]['altOrth']) for key in treeAttributes]
                _treeType = ''
                for i,e in enumerate(keyPair):
                    if treeType == e[1]:
                        _treeType = e[0]
                
                treeDict = treeAttributes[_treeType]
            
            # if this tree type is not listed in the keys nor the alternative orthographies provided
            else:
                raise AttributeError(
                """
                '%s' is not a recognized value (orthography) in the `treeAttributes` dict.
                
                """%(treeType))
                
            # if necessary, re-assign treeType attirbute so the string orthography matches the dict key
            self.treeType = treeDict['treeType']
            self.firstHarvest = treeDict['firstHarvest']
            self.fullHarvest = treeDict['fullHarvest']
            self.descentHarvest = treeDict['descentHarvest']
            self.death = treeDict['death']
            self.cuerdaHarvestCap = treeDict['cuerdaHarvestCap']
            
        # if no dictionary was passed with the attributes
        else:
            # assign some default attributes
            self.firstHarvest = {'year': 4, 'proportion': 0.2} 
            self.fullHarvest = {'year': 5, 'proportion': 1.0}                  
            self.descentHarvest = {'year': 28, 'proportionDescent': 0.2}
            self.death = {'year': 30} 
            self.cuerdaHarvestCap = 200 # units, in this case lbs, per cuerda
            # should the name of the tree type change? see comm code below
            # self.treeType = 'default-param'
            
        
    def isUpdate(self):
        boolean = False
        nextEventNum = self.currentEventNum + 1
        
        deathYear = self.currentEvent["deathYear"][self.currentEventNum]
        nextStartYear = self.currentEvent["startYear"][nextEventNum]
        
        # first check whether the tree dies
        if (self.currentYear >= deathYear):
            boolean = True
        
        # then check if another event happens this year
        elif(self.currentYear >= nextStartYear):
            boolean = True
        # otherwise don't change the event
        else:
            boolean = False
            
        return(boolean)
    
    def updateEvent(self):
        if (self.isUpdate()):
            # raise the index so the new event can be put into action
            self.currentEventNum += 1
            self.currentEvent = self.sortedEvents[self.currentEventNum]
            
            self.currentStatus = self.currentEvent["status"]
            
            
    def assignStatus(self):
        if (self.currentStatus == "plant") or (self.currentStatus == "replant"):
            x = 0
            # set it up so the production and such is fit to match
            
        elif (self.currentStatus == "prune"):
            x = 1
            # set it up so the production and such is fit to match
            
        else:
            raise AttributeError(
                 """
                '%s' is not a recognized status in the scope of this program.
                
                """%(self.currentStatus))
        
        
        
        

In [79]:
testDf = groupByPlot(simData, 3)
#test = testDf.reset_index(drop=True)
testClass = Farmarelli(eventRows=testDf, initialYear=2020)
#fruit

testClass.year

AttributeError: 'Farmarelli' object has no attribute 'year'

In [56]:
test = testDf.reset_index(drop=True)
test

Unnamed: 0,plotID,farmerName,treeType,numCuerdas,status,startYear,deathYear
0,3,Ing. Leticia Villareal,borbon,2,plant,1992,2022
1,3,Ing. Leticia Villareal,borbon,2,replant,2023,2053
2,3,Ing. Leticia Villareal,borbon,2,prune,2031,2057


In [68]:
pd.DataFrame.loc?

[0;31mType:[0m        property
[0;31mString form:[0m <property object at 0x7fadf04b4818>
[0;31mDocstring:[0m  
Access a group of rows and columns by label(s) or a boolean array.

``.loc[]`` is primarily label based, but may also be used with a
boolean array.

Allowed inputs are:

- A single label, e.g. ``5`` or ``'a'``, (note that ``5`` is
  interpreted as a *label* of the index, and **never** as an
  integer position along the index).
- A list or array of labels, e.g. ``['a', 'b', 'c']``.
- A slice object with labels, e.g. ``'a':'f'``.

      start and the stop are included

- A boolean array of the same length as the axis being sliced,
  e.g. ``[True, False, True]``.
- A ``callable`` function with one argument (the calling Series or
  DataFrame) and that returns valid output for indexing (one of the above)

See more at :ref:`Selection by Label <indexing.label>`

Raises
------
KeyError
    If any items are not found.

See Also
--------
DataFrame.at : Access a single value for a 

Unnamed: 0,plotID,farmerName,treeType,numCuerdas,status,startYear,deathYear
0,0,Ing. José Emilio Godínez,catuai,0,plant,2010,2027
1,0,Ing. José Emilio Godínez,catuai,0,replant,2028,2045
2,0,Ing. José Emilio Godínez,catuai,0,prune,2036,2048
3,0,Ing. José Emilio Godínez,catuai,0,replant,2028,2066


# events --> class?
divorce state from class

state of class can still change, but an entry of data should be written for each iteration (each year of simulation for each plot)
because then every single factor can be tracked and accounted for in the simulation.

## What data might be included?
### For each year for each plot, record the following: 
* year (of simulation/iteration)
* plot ID
* farmer name
* tree type
* tree standard production
* cuerdas
* production cycle (none, partial, full, descent, death)
* proportion of production
* event? bool (True/False)
    * if event: event type (category)
* this year's production
* (tree death/disese/other factors???)
* anything else

## plan??

1. retain foundation of Farm class, but add accessor functions that write the state of the class as the simulation iterates. Keep track of all of the information above and/or anything that could be useful to have tracked.
2. ...