# Modelling herb runs in Old School Runescape

## 01-Background
Herbs are a type of plant that can be grown with the Farming skill in Old School Runescape. Personally, I find herb runs (a "run" meaning one visit to each herb patch around the in-game map to harvest the crop) perfect for training Farming, as they are quick and cheap. Direct teleports are available to most of the herb patches, so the patches can be visited in any order and there is no need to determine an optimal route. The purpose of modelling (and perhaps simulating) herb runs is to optimise cost per XP, as well as to improve my fairly basic Python skills (mainly learning numpy, pandas and matplotlib).

Plant growth in Old School Runescape runs on a discrete time system referred to as ticks. Each plant type (herbs, fruit trees, hops, etc.) has a set total growth time before the plant is ready for harvesting. These growth times are partitioned into ticks as displayed at <https://oldschool.runescape.wiki/w/Farming#Growth_cycles>. Herbs operate on a tick of 20 minutes between growth stages until a total of 80 minutes has been reached and the herb is ready to harvest, provided the herb has not contracted a disease in any of the preceding growth stages. Herb patches can be treated with one of three tiers of compost, which increase crop yield and decrease disease chance. These composts also increase the minimum possible number of herbs that can be harvested from a single herb patch.

A simple simulation of a single herb patch goes as follows:

0. The player plants a herb seed in a herb patch and treats it with compost. The zeroth growth stage of the herb lasts until the next (any-hour):15, (any-hour):35 or (any-hour):55, so nearly an entire growth stage can be skipped if the herb is planted just before one of these times, e.g. at (any-hour):14.
1. The first growth stage is reached at one of the times specified in the previous step. A random number is generated to check whether the herb contracts a disease. If a disease is contracted, growth stops and the plant dies on the next farming tick. During this intermediate tick, while the herb is diseased but not yet dead, it can be treated with a plant cure.
2. The second growth stage is reached and the previous step is repeated.
3. The third and final growth stage is reached. If the herb has remained free of disease for the whole growth period, it stays harvestable indefinitely.
4. The player on a herb run comes to harvest the fully grown herb. The minimum number of herbs is harvested, plus a variable extra amount which is known to depend on Farming level and random number generation.

This process therefore takes a minimum of 61 minutes and a maximum of 80 minutes, plus the time the player takes to get to the patch and harvest.
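
The timing described above can be sketched in Python. This is only a rough sketch, not the game's actual implementation: the function names `minutes_until_grown` and `survives_disease_checks`, the assumption that four ticks must pass between planting and harvest (chosen so that the 61 and 80 minute bounds are reproduced), and the `disease_chance` value are all illustrative assumptions.

```python
import random

# Herb growth ticks fall at these minutes past every hour (from the text).
TICK_MINUTES = (15, 35, 55)
TICK_LENGTH = 20  # minutes between consecutive ticks


def minutes_until_grown(plant_minute, n_ticks=4):
    """Minutes from planting until the herb is harvestable.

    Assumes (hypothetically) that n_ticks growth ticks must pass after
    planting, and that a seed planted exactly on a tick waits a full
    20 minutes for the next one.
    """
    minute = plant_minute % 60
    waits = [(tick - minute) % 60 for tick in TICK_MINUTES]
    first_wait = min(w if w > 0 else TICK_LENGTH for w in waits)
    return first_wait + TICK_LENGTH * (n_ticks - 1)


def survives_disease_checks(rng, disease_chance, n_checks=2):
    """True if the herb passes every disease check (steps 1 and 2).

    disease_chance is a placeholder parameter, not a value from the game.
    """
    return all(rng.random() >= disease_chance for _ in range(n_checks))


rng = random.Random(0)
print(minutes_until_grown(14))  # planted just before a tick
print(minutes_until_grown(15))  # planted just after a tick
print(survives_disease_checks(rng, disease_chance=0.1))
```

Keeping the timing and the disease checks in separate functions means either assumption can be swapped out later without touching the other.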

In-game there are 9 available herb patches. Some of these patches have extra in-game requirements to unlock. For the purposes of my data collection I only have access to 8 out of 9 herb patches.

## 02-Theoretical Framework
If we define the number of extra herbs harvested from a single herb patch as a random variable $X$, then we can consider modelling it as a negative binomial random variable: the player keeps harvesting extra herbs beyond the minimum possible value until the first failure. Each individual harvest of a herb can be treated as a Bernoulli trial with some unknown fixed probability of success $p$ (which is rumoured to depend on the type of herb and the player's Farming level). Hence, we need some method of estimating this unknown probability $p$ of obtaining a single extra herb. From this probability (or our best estimate of it) we can construct the particular negative binomial distribution in question as $X\sim {NB}\left(1,p\right)$, that is, the number of successes before the first failure. From this distribution we can also do some exploratory simulation.
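
As a rough illustration of this model, the sketch below draws $X$ by repeating Bernoulli trials until the first failure. The names `simulate_extra_herbs` and `simulate_many` are made up for this example, and `p_assumed = 0.8` is a placeholder rather than an estimate from any data.

```python
import random


def simulate_extra_herbs(p, rng):
    """Count successes before the first failure; each trial succeeds with probability p."""
    extra = 0
    while rng.random() < p:
        extra += 1
    return extra


def simulate_many(p, n, seed=0):
    """Draw n independent values of X using a seeded generator."""
    rng = random.Random(seed)
    return [simulate_extra_herbs(p, rng) for _ in range(n)]


p_assumed = 0.8  # placeholder value, NOT an estimate from game data
sample = simulate_many(p_assumed, 100_000)
sample_mean = sum(sample) / len(sample)
# Compare the sample mean with the theoretical mean p / (1 - p).
print(sample_mean, p_assumed / (1 - p_assumed))
```

With numpy, `Generator.negative_binomial(n, p)` should give the same distribution, but it counts failures before `n` successes, so the roles of success and failure are swapped and the call would be `rng.negative_binomial(1, 1 - p)`.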

We can use the collected data to obtain a maximum likelihood estimate of $p$. To do so we need to construct a likelihood function from the known probability mass function (PMF) of a negative binomial random variable. In general, if we let $Y\sim {NB}\left(r,p\right)$ count the number of successes before the $r$-th failure, with success probability $p$, then as listed on <https://en.wikipedia.org/wiki/Negative_binomial_distribution>, the PMF is
$$ f_Y(y,r,p) = P(Y=y) = \binom{y+r-1}{y} p^{y}(1-p)^{r}. $$
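
A quick way to sanity-check this PMF is to code it directly and confirm that the probabilities sum to roughly 1. The function name `nb_pmf` and the value `p_assumed = 0.8` are illustrative assumptions only.

```python
from math import comb


def nb_pmf(y, r, p):
    """P(Y = y): y successes (probability p each) before the r-th failure."""
    return comb(y + r - 1, y) * p**y * (1 - p) ** r


p_assumed = 0.8  # placeholder value, NOT an estimate from game data
# With r = 1 the PMF reduces to the geometric form p**y * (1 - p).
print([round(nb_pmf(y, 1, p_assumed), 4) for y in range(4)])
# Summing over a long range of y should give a total close to 1.
print(sum(nb_pmf(y, 1, p_assumed) for y in range(500)))
```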

Now we switch to a likelihood-based perspective. Let $\theta$ be a candidate value for $p$, and let the collected data be labelled $\underline{y}=\left(Y_1=y_1, Y_2=y_2,\dots, Y_n=y_n\right)$ for $n$ observations of herbs harvested. The likelihood of $\theta$ given an individual observation is then the PMF above with the fixed observed value $y_i$ in place of $y$ and $\theta$ in place of $p$ (with $i$ labelling each data point):
$$ l(\theta \ \vert \ Y_i=y_i) = P_{\theta} (Y_i=y_i) = \binom{y_i+r-1}{y_i}\theta^{y_i}(1-\theta)^{r}. $$

Assuming the observations are independent, the joint probability of seeing the data $\underline{y}$ for a given $\theta$ is the product of these individual PMFs:
$$ L\left(\theta \ \vert \ \underline{y},r\right) = \prod_{i=1}^n \binom{y_i+r-1}{y_i} \theta^{y_i}(1-\theta)^{r}. $$

To find the maximum likelihood estimate from our data, we need to maximise the above function over the choice of $\theta$. The $\theta$ that does so is our best estimate of $p$, the probability of successfully picking a single extra herb. Maximising such a product directly is awkward, so it is usual to work with its logarithm instead, and we can consider using Python packages to aid greatly.
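
For this particular likelihood the maximisation can also be checked by hand. Taking logs and setting the derivative with respect to $\theta$ to zero gives
$$ \frac{\sum_{i=1}^n y_i}{\theta} - \frac{nr}{1-\theta} = 0 \quad\Longrightarrow\quad \hat{\theta} = \frac{\sum_{i=1}^n y_i}{nr + \sum_{i=1}^n y_i}. $$

The sketch below compares this closed form with a crude grid search over the log-likelihood. The function names are made up for this example, and the data are synthetic draws from a placeholder `p_true = 0.8`, not real harvest data.

```python
import random
from math import lgamma, log


def log_likelihood(theta, ys, r=1):
    """Log of L(theta | ys, r) for the PMF used in the text."""
    total = 0.0
    for y in ys:
        # log of the binomial coefficient C(y + r - 1, y) via log-gamma
        log_coeff = lgamma(y + r) - lgamma(y + 1) - lgamma(r)
        total += log_coeff + y * log(theta) + r * log(1 - theta)
    return total


def mle_closed_form(ys, r=1):
    """theta_hat = sum(y) / (n*r + sum(y)), from setting the derivative to zero."""
    s = sum(ys)
    return s / (len(ys) * r + s)


def mle_grid(ys, r=1, steps=999):
    """Crude numerical maximiser: evaluate the log-likelihood on a grid in (0, 1)."""
    grid = [k / (steps + 1) for k in range(1, steps + 1)]
    return max(grid, key=lambda theta: log_likelihood(theta, ys, r))


# Synthetic data drawn with a known placeholder p, NOT real harvest data.
rng = random.Random(42)
p_true = 0.8
ys = []
for _ in range(500):
    y = 0
    while rng.random() < p_true:
        y += 1
    ys.append(y)

# The two estimates should agree to within the grid spacing.
print(mle_closed_form(ys), mle_grid(ys))
```

For a less crude numerical approach, a bounded scalar optimiser such as `scipy.optimize.minimize_scalar` applied to the negative log-likelihood would be one option, though the grid is enough to confirm the closed form here.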

From this estimate we can consider simulating herb runs and seeing how well the simulated harvests match the collected data.