Skip to content
Functions to comprise an R package that will facilitate the identification of (marine) important bird areas based on individual tracking data. Key functions include utilities to identify and summarise individual foraging trips, estimate utilisation distributions, and overlay distributions to identify important aggregation areas. Two utility func…
Branch: master
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Type Name Latest commit message Commit time
Failed to load latest commit information.


This package is comprised of functions that facilitate the identification of areas of importance for biodiversity, such as Key Biodiversity Areas (KBAs) or Ecologically or Biologically Significant Areas (EBSAs), based on individual tracking data. For further detail concerning the method itself, please refer to this paper by Lacelles et al. (2016).

Key functions include utilities to estimate individual core use areas, the level of representativeness of the tracked sample, and overlay individual distributions to identify important aggregation areas. Other functions assist in formatting your data set, splitting and summarizing individual foraging trips, and downloading data from Movebank.


You can download the development version from GitHub with:

# install.packages("devtools", dependencies = TRUE)
devtools::install_github("steffenoppel/track2iba", dependencies=TRUE) # add argument 'build_vignettes = FALSE' to speed it up


Now we will use tracking data collected at a seabird breeding colony to illustrate a track2KBA workflow for identifying important sites. It is important to note that the specific workflow you use (i.e. which functions and in what order) will depend on the species of interest and the associated data at hand.

First, in order for the data to work in track2KBA functions, we can use the formatFields function to format the important data columns needed for track2KBA analysis. These are: a datetime field, latitude and longitude fields, and an ID field (i.e. individual animal, track, or trip).

library(track2KBA) # load package

# ?boobies  # for some background on the data set 

tracks <- formatFields(boobies, 
  field_ID   = "track_id", 
  field_Date = "date_gmt", 
  field_Time = "time_gmt",
  field_Lon  = "longitude", 
  field_Lat  = "latitude"

If your data come from a central-place foraging species (i.e. one which makes trips out from a centrally-located place, such as a nest in the case of a bird), you can use tripSplit to split up the data into discrete trips.

In order to do this, you must identify the location of the central place (e.g. nest or colony).


# here we know that the first points in the data set are from the nest site
colony <- tracks %>% 
    Longitude = first(Longitude), 
    Latitude  = first(Latitude))

Our colony dataframe tells us where trips originate from. Then we need to set some parameters to decide what constitutes a trip. To do that we should use our understanding of the movement ecology of the study species. So in this case we know our seabird travels out to sea on the scale of a few kilometers, so we set InnerBuff (the minimum distance from the colony) to 3 km, and Duration (minimum trip duration) to 1 hour. ReturnBuff can be set further out in order to catch incomplete trips, where the animal began returning, but perhaps due to device failure the full trip wasn't captured.

Optionally, we can set rmColLocs to TRUE which will remove those points falling within the InnerBuff. In this case it is a good idea since we plan to calculate core ranges away from the colony.

trips <- tripSplit(
  tracks     = tracks, 
  Colony     = colony, 
  InnerBuff  = 3,      # kilometers
  ReturnBuff = 10, 
  Duration   = 1,      # hours
  plotit     = TRUE,   # visualize individual trips
  rmColLocs  = TRUE)
#> [1] "track 693041 does not return to the colony"
#> [1] "track 693434 does not return to the colony"

Then we can summarize the trip movements, using tripSummary.

tripSum <- tripSummary(Trips = trips, Colony = colony)

#> # A tibble: 215 x 10
#> # Groups:   ID [41]
#>    ID    trip_id n_locs departure           return              duration
#>    <chr> <chr>    <dbl> <dttm>              <dttm>                 <dbl>
#>  1 69302 693021     274 2012-07-22 07:52:11 2012-07-22 16:11:03     8.31
#>  2 69302 693022     124 2012-07-23 12:26:22 2012-07-23 15:54:05     3.46
#>  3 69302 693023     138 2012-07-25 08:30:53 2012-07-25 12:22:53     3.87
#>  4 69304 693041    1268 2013-08-22 11:50:41 2013-08-24 23:03:50    NA   
#>  5 69305 693051      71 2013-08-22 13:08:15 2013-08-22 15:10:59     2.05
#>  6 69306 693061      37 2014-01-06 16:28:42 2014-01-06 17:32:11     1.06
#>  7 69306 693062      83 2014-01-07 14:48:24 2014-01-07 17:10:21     2.37
#>  8 69306 693063     129 2014-01-08 14:25:11 2014-01-08 17:55:22     3.50
#>  9 69306 693064      50 2014-01-08 18:08:32 2014-01-08 19:30:40     1.37
#> 10 69306 693065     155 2014-01-09 14:47:04 2014-01-09 19:21:32     4.57
#> # ... with 205 more rows, and 4 more variables: total_dist <dbl>,
#> #   max_dist <dbl>, direction <dbl>, complete <chr>

Now that we have an idea how the animals are moving, we can start with the process of estimating their space use areas, and sites of aggregation!

findScale provides options for setting the all-important smoothing parameter in the Kernel Density Estimation. It calculates candidate smoothing parameters using several different methods.

If we know our animal uses an area-restricted search (ARS) strategy to locate prey, then we can set the ARSscale=TRUE. This uses First Passage Time analysis to identify the spatial scale at which area-restricted search is occuring, which may then be used as the smoothing parameter value.

Hvals <- findScale(trips,
  ARSscale      = TRUE,
  Colony        = colony,
  Trips_summary = tripSum)
#> [1] "No peak was found for: ID 69309"
#> [1] "No peak was found for: ID 69314"
#> [1] "No peak was found for: ID 69328"
#> [1] "No peak was found for: ID 69330"
#> [1] "No peak was found for: ID 69332"

#>   med_max_dist  mag scaled_mag href ARSscale
#> 1        24.73 3.21        7.7 7.51    20.76

The other values are more simplistic methods of calculating the smoothing parameter. href is the canonical method, and relates to the number of points in the data and their spatial variance. mag and scaled_mag are based on the average foraging range (med_max_dist in the tripSum output) estimated from the trips present in the data. These two methods only work for central-place foragers.

Then, we must select a smoothing parameter value. To inform our decision, we ought to use our understanding of the species' movement and foraging ecology to guide our decision about what scales make sense. That is, from the findScale output, we want to exclude values which we believe may under- or over-represent the area used by the animals while foraging.

Once we have chosen a smoothing value, we can produce Kernel Density Estimations for each individual, using estSpaceUse. By default this function isolates each animal's core range (i.e. the 50% utilization distribution, or where the animal spends about half of its time) which is a commonly used standard (Lascelles et al. 2016). However, this can easily be adjusted using the UDLev argument.

KDEs <- estSpaceUse(
  DataGroup = trips, 
  Scale = Hvals$mag, 
  UDLev = 50, 
  polyOut = TRUE)

At this step we should verify that the smoothing parameter value we selected is producing reasonable space use estimates, given what we know about our study animals. Are the core areas much larger than expected? Much smaller? If so, consider using a different value for the Scale parameter.

The next step is to estimate how representative this sample of animals is of the population. That is, how well does the variation in space use of these tracked individuals encapsulate variation in the wider population? To do this, we use the repAssess function. This function repeatedly samples a subset of individual core ranges, averages them together, and quantifies how many points from the unselected individuals fall within this combined core range area. This process is run across the range of the sample size, and iterated a chosen number of times.

To speed up this procedure, we can supply the output of estSpaceUse. We can choose the number of times we want to re-sample at each sample size by setting the Iteration argument. The higher the number the more confident we can be in the results, but the longer it will take to compute.

repr <- repAssess(trips, KDE = KDEs$KDE.Surface, Scale = Hvals$mag, Iteration = 1, BootTable = FALSE)
#> [1] "nls (non linear regression) successful, asymptote estimated for bootstrap sample."

The output is a dataframe, with the estimated percentage of representativeness given in the out column.

The relationship between sample size and the percent coverage of un-tested animals' space use areas (i.e. Inclusion) is visualized in the output plot seen below, which is automatically saved to the working directoty (i.e. getwd()) each time repAssess is run.

By quantifying this relationship, we can estimate how close we are to an information asymptote. Put another way, we have estimated how much new space use information would be added by tracking more animals. In the case of this seabird dataset, we estimate that ~98% of the core areas used by this population are captured by the sample of 39 individuals. Highly representative!

Now, using findKBA we can identify aggregation areas. Using the core area estimates of each individual we can calculate where they overlap. Then, we estimate the proportion of the larger population in a given area by adjusting our overlap estimate based on the degree of representativeness we've achieved.

Here, if we have population size estimates, we can include this value (using the Col.size argument) to estimate to output a number of individuals aggregating in space, which can then use to compare against importance criteria (i.e KBA, EBSA criteria). If we don't this will output a percentage of individuals instead.

If you desire polygon output of the overlap areas, instead of a gridded surface, you can indicate this using the polyOut argument. This aggregates all cells with the same estimated number/percentage of individuals into to single polygons.

KBAs <- findKBA(
  KDE = KDEs,
  Represent = repr$out,
  UDLev = 50,
  Col.size = 500,     # 500 seabirds breed one the island
  polyOut = TRUE,
  plotit = FALSE)     # we will plot in next step

#> [1] "sf"         "data.frame"

In findKBA we can specify plotit=TRUE if we want to visualize the result right away. However, there a numerous ways in which we might want to customize the output. The following are examples of code which can be used to visualize the two types of output from the findKBA function.

If we specified polyOut=TRUE, then the output will be in Simple Features format, and the data are spatial polygons. This allows us to easily take advantage of the ggplot2 plotting syntax to make an attractive map!

coordsets <- sf::st_bbox(KBAs)

KBAPLOT <- KBAs %>% dplyr::filter(.data$potentialKBA==TRUE) %>%
  ggplot() +
  geom_sf(mapping = aes(fill=N_animals, colour=N_animals)) +  # if not exporting to pdf, colour="transparent" works
  borders("world", fill="dark grey", colour="grey20") +       # plot basic land mass dataset from maps package
    xlim = c(coordsets$xmin, coordsets$xmax),
    ylim = c(coordsets$ymin, coordsets$ymax), expand = FALSE) +
    axis.text=element_text(size=14, colour="black"),
    panel.border = element_rect(colour = "black", fill=NA, size=1)) +
  guides(colour=FALSE) +
  scale_fill_continuous(name = "N animals") +
  ylab("Latitude") +

## we can easily add the colony location information for reference
KBAPLOT <- geom_point(data=colony, aes(x=Longitude, y=Latitude), col='red', shape=16, size=2)

## in case you want to save the plot
# ggplot2::ggsave("KBAPLOT", device="pdf")

This map shows the 'potential KBA' area; that is, the areas which are used by a significant proportion of the local population, given the representativeness of the sample of tracked individuals. In this case, since representativeness is >90%, any area used by 10% or more of the population is considered important (see Lascelles et al. 2016 for details).

Then, we can combine all the polygons within the 'potentialKBA' area, and using the maximum number of individuals present in that area we can assess whether it merits designation as a Key Biodiversity Area according to the KBA standard.

potKBA <- KBAs %>% dplyr::filter(.data$potentialKBA==TRUE) %>% 
     max_animals = max(na.omit(N_animals)), # maximum number of animals aggregating in the site
     min_animals = min(na.omit(N_animals))  # minimum number using the site

# plot(potKBA[1])

If in findKBA we instead specify polyOut=FALSE, our output will be a spatial grid of animal densities, with each cell representing the estimated number, or percentage of animals using that area. So this output is irrespective of the representativness-based importance threshold.

plot(KBA_sp[KBA_sp$N_animals > 0, ])

This plot shows the minimum estimated number of birds using the space around the breeding island.

You can’t perform that action at this time.