# Statistics

CanoPyHydro provides access to a wealth of data regarding tree canopies. 

To make these statistics more accessible, they can be calculated in bulk with the 'statistics' function. \
The function is called below, and definitions for each statistic can be found in the Glossary. 

In [None]:
# Setting the configuration file location before importing CanoPyHydro
import os

os.environ["CANOPYHYDRO_CONFIG"] = "./canopyhydro_config.toml"
from canopyhydro.CylinderCollection import CylinderCollection

# Initializing a CylinderCollection object
myCollection = CylinderCollection()

# Converting a specified file to a CylinderCollection object
myCollection.from_csv("5_SmallTree.csv")

# Requesting a plot of the tree projected onto the XY plane (bird's-eye view)
myCollection.project_cylinders("XY")

# creating the digraph model
myCollection.initialize_digraph_from()

# Calculating the summary statistics in bulk
stat_file = myCollection.statistics()

# Will generate a file with all of the statistics listed below
# total_psa
# psa_w_overlap
# stem_psa
# stem_psa_w_overlap
# tot_surface_area
# stem_surface_area
# tot_hull_area
# tot_hull_boundary
# stem_hull_area
# stem_hull_boundary
# num_drip_points
# max_bo
# topQuarterTotPsa
# topHalfTotPsa
# topThreeQuarterTotPsa
# TotalShade
# top_quarter_shade
# top_half_shade
# top_three_quarter_shade
# DBH
# volume
# X_max
# Y_max
# Z_max
# X_min
# Y_min
# Z_min
# Order_zero_angle_avg
# Order_zero_angle_std
# Order_one_angle_avg
# Order_one_angle_std
# Order_two_angle_avg
# Order_two_angle_std
# Order_three_angle_avg
# Order_three_angle_std
# order_gr_four_angle_avg
# order_gr_four_angle_std
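
Once the statistics file exists, individual values can be combined into derived measures. The sketch below is a minimal illustration that assumes the file is a CSV with one column per statistic; the real layout may differ. The `read_statistics` helper is hypothetical, and the numbers are placeholders rather than results from a real tree.

```python
import csv
import io

# Hypothetical file contents: the real column set and layout may differ,
# and these numbers are placeholders, not results from a real tree.
example_csv = io.StringIO(
    "total_psa,stem_psa,num_drip_points,DBH\n"
    "12.5,2.5,40,0.3\n"
)


def read_statistics(handle):
    """Return the first data row as a dict of floats keyed by statistic name."""
    row = next(csv.DictReader(handle))
    return {name: float(value) for name, value in row.items()}


stats = read_statistics(example_csv)

# Share of the total projected area that belongs to the stemflow-generating portion
stem_share = stats["stem_psa"] / stats["total_psa"]
print(f"stem share of projected area: {stem_share:.2f}")
```

To read a file from disk instead, open it with `open(path, newline="")` and pass the handle to the same helper.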

In addition to this bulk function, some individual statistics can be found using dedicated functions. A few such functions are shown below.

In [None]:
# Identifying flow information (covered later)
myCollection.find_flow_components()
myCollection.calculate_flows(plane="XY")

# Identifying the overlap of branches at various heights/depths in the canopy
myCollection.find_overlap_by_percentile(plane = ['XY'])
myCollection.find_overlap_by_percentile(plane = ['XY'],percentiles=[33,66,99])
myCollection.find_overlap_by_percentile(plane = ['XZ'],percentiles=[10,20,30,70])

# Diameter at breast height
myCollection.get_dbh()

# Angle of the theoretical line from tip
#  of trunk to base of trunk
myCollection.find_trunk_lean()
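
To make the trunk-lean idea concrete, the sketch below computes the angle between a base-to-tip line and the vertical axis using only the standard library. It is an illustration of the geometry, not CanoPyHydro's implementation: the function name `lean_angle_degrees` is hypothetical, and the library may measure the angle against a different reference axis or in different units.

```python
import math


def lean_angle_degrees(base, tip):
    """Angle in degrees between the base-to-tip line and the vertical (z) axis."""
    dx, dy, dz = (t - b for b, t in zip(base, tip))
    horizontal = math.hypot(dx, dy)  # length of the line's XY component
    return math.degrees(math.atan2(horizontal, dz))


# Placeholder coordinates (x, y, z) for the base and tip of a trunk
trunk_base = (0.0, 0.0, 0.0)
trunk_tip = (1.0, 0.0, 1.0)

print(lean_angle_degrees(trunk_base, trunk_tip))
```

A perfectly vertical trunk gives an angle of zero under this convention, and the angle grows as the tip moves horizontally away from the base.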

Though many statistics do not have dedicated functions to calculate them directly, many are readily accessible. \
Browsing through the larger 'statistics' function and the 'calculate_flows' function can show users methods for finding a variety of summary statistics. 

In [None]:
# Finding the total projected area 
# from 'statistics'
import numpy as np
from canopyhydro.geometry import unary_union
twod_polys = myCollection.pSV
projection_of_all_branches = unary_union(twod_polys)
projected_area_without_overlap = projection_of_all_branches.area

# Finding the sum of all cylinder projected areas
# (overlapping regions are counted once per cylinder)
xy_projected_area = np.sum(
    [cyl.projected_data["XY"]["area"] for cyl in myCollection.cylinders]
)
xz_projected_area = np.sum(
    [cyl.projected_data["XZ"]["area"] for cyl in myCollection.cylinders]
)

# Finding the total volume of all cylinders
total_volume = np.sum([cyl.volume for cyl in myCollection.cylinders])
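
The difference between the union area and the summed area is the overlap between branches. The sketch below illustrates this with two axis-aligned rectangles standing in for projected cylinders; the helper names and coordinates are hypothetical, and real projections are arbitrary polygons rather than rectangles.

```python
def rect_area(rect):
    """Area of a rectangle given as (min_x, min_y, max_x, max_y)."""
    min_x, min_y, max_x, max_y = rect
    return (max_x - min_x) * (max_y - min_y)


def intersection_area(a, b):
    """Area shared by two axis-aligned rectangles (zero if they do not touch)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0.0) * max(height, 0.0)


# Two placeholder "branches" that overlap between x = 3 and x = 4
branch_a = (0.0, 0.0, 4.0, 1.0)
branch_b = (3.0, 0.0, 5.0, 1.0)

# Summing per-branch areas counts the shared strip twice
area_with_overlap = rect_area(branch_a) + rect_area(branch_b)

# The union counts the shared strip once
area_without_overlap = area_with_overlap - intersection_area(branch_a, branch_b)

print(area_with_overlap, area_without_overlap)
```

For more than two shapes the pairwise subtraction no longer suffices, which is why a general polygon union is used in the cell above.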

# Flow Statistics

For additional details on finding flow attributes, see [Flow Identification](flow_identification_drawing.ipynb). \
The main output of the 'calculate_flows' function is the 'flows' attribute, which is added to myCollection and populated with flow statistics (see below).

In [None]:
# printing the first 20 flows found 
myCollection.flows[0:20]


The first element of the above array is always stemflow. The other flows are all drip flows (those that contribute to throughfall) and are listed in no particular order. \
Note also that each flow has a 'drip_node_loc' (drip node location). This attribute refers to the x, y, and z coordinates of the point at which a flow drops off of a branch to the ground. 
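
Because stemflow always comes first, the list can be split by position and the drip locations used directly. The sketch below is a minimal illustration with a hypothetical `ExampleFlow` stand-in that models only 'drip_node_loc'; the real flow objects carry more attributes, and the coordinates here are placeholders.

```python
import math
from dataclasses import dataclass


@dataclass
class ExampleFlow:
    """Hypothetical stand-in for a flow record; only drip_node_loc is modelled."""

    drip_node_loc: tuple


# Placeholder coordinates, not output from a real tree
flows = [
    ExampleFlow(drip_node_loc=(0.0, 0.0, 0.0)),  # stemflow is the first element
    ExampleFlow(drip_node_loc=(3.0, 4.0, 2.5)),
    ExampleFlow(drip_node_loc=(-1.0, 0.0, 4.0)),
]

# Split the list by position: stemflow first, drip flows after
stemflow, drip_flows = flows[0], flows[1:]


def horizontal_distance(a, b):
    """Distance between two points measured in the XY plane only."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


# How far from the stemflow location each drip point lands
distances = [
    horizontal_distance(flow.drip_node_loc, stemflow.drip_node_loc)
    for flow in drip_flows
]
print(distances)
```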

In addition to defining 'myCollection.flows', the flow functions called earlier ('find_flow_components' and 'calculate_flows') set the attribute 'is_stem' for each cylinder. \
By using 'is_stem', we can isolate the stemflow-generating portion of the tree. 

In [None]:
# plotting the boundary of the whole tree
whole_tree_hull, _ = myCollection.watershed_boundary(
    plane="XY",
    curvature_alpha=0.15,
    draw=True,
)

# plotting the boundary of the stemflow generating portion alone
stem_flow_hull,_ = myCollection.watershed_boundary(
    plane="XY",
    curvature_alpha=0.15,
    filter_lambda=lambda: is_stem,
    draw=True,
)

In doing so, we can calculate statistics for only the subset of cylinders that contribute to stemflow.

In [None]:
# Demonstrating how statistics can be generated from hull objects
print(type(stem_flow_hull))

print('Whole Tree Coverage Area:')
print(whole_tree_hull.area)

print('Stem Flow Coverage Area:')
print(stem_flow_hull.area)


print('')

print('Whole Tree Coverage Bounds:')
print(whole_tree_hull.bounds)

print('Stem Flow Coverage Bounds:')
print(stem_flow_hull.bounds)

print('')

# Demonstrating how is_stem might be used in other calculations:
# here, the volume of only the cylinders that contribute to stemflow
stem_volume = np.sum(
    [cyl.volume for cyl in myCollection.cylinders if cyl.is_stem]
)
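
The 'area' and 'bounds' values printed above can be reproduced by hand for simple shapes. The sketch below uses the shoelace formula on two placeholder polygons; the helper names and coordinates are hypothetical. It also assumes bounds are ordered as (min x, min y, max x, max y), which is Shapely's convention if the hulls are Shapely polygons (the `print(type(...))` call above shows the actual type).

```python
def polygon_area(vertices):
    """Shoelace formula for a simple polygon given as ordered (x, y) vertices."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_bounds(vertices):
    """Bounding box as (min_x, min_y, max_x, max_y)."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return (min(xs), min(ys), max(xs), max(ys))


# Placeholder outlines, not hulls from a real tree
whole_tree = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
stem_portion = [(1.0, 1.0), (3.0, 1.0), (3.0, 2.0), (1.0, 2.0)]

# Fraction of the canopy footprint that drains to the stem
stem_fraction = polygon_area(stem_portion) / polygon_area(whole_tree)

print(polygon_bounds(stem_portion))
print(round(stem_fraction, 3))
```

The same ratio can be taken directly from the two hull objects by dividing one 'area' by the other.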

For detailed definitions of all of the statistics available to users, see the [Glossary](../glossary.rst).