# Script for machine learning with sloth skull data
# SUPERVISED LEARNING -----------------------------------------------------
# Load the caret package
# Import the dataset sloth_landmarks.csv (in the lesson_plans/machine_learning folder)
# Make sure to import "From Text (base)"
# View the data
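# A minimal sketch of the three steps above. The relative path is an assumption
# built from the folder named in the instructions; adjust it to your own working
# directory. read.csv() is the base R reader, and stringsAsFactors = TRUE is a
# choice made here so that text columns can be used as class labels.
library(caret)

sloth_path <- "lesson_plans/machine_learning/sloth_landmarks.csv"  # assumed path
if (file.exists(sloth_path)) {
  sloth_landmarks <- read.csv(sloth_path, stringsAsFactors = TRUE)
  str(sloth_landmarks)          # column names and types
  print(head(sloth_landmarks))  # first few rows
  # In RStudio, View(sloth_landmarks) opens the data in a spreadsheet-style tab
}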
# Exploratory data analysis -----------------------------------------------
# Refer to your code from intro_machinelearning_supervised.Rmd
# There are A LOT of variables in this dataset, so explore whichever ones look most interesting
# Spend a little time on the variables you think might be useful later
# Use the featurePlot() function to look at the variables (you won't be able to see them all at once)
# You can color the plots based on species or genus to compare Choloepus to Bradypus
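# One possible way to draw these plots, sketched on the built-in iris data as a
# stand-in because the sloth column names are not listed in this script. For the
# sloth data, swap in a handful of landmark columns for x and the species or
# genus column (as a factor) for y.
library(caret)

data(iris)

# Box plots of each measurement, one panel per variable, split by class
box_plot <- featurePlot(x = iris[, 1:4], y = iris$Species, plot = "box")
print(box_plot)  # featurePlot() returns a lattice object; print() draws it

# Scatterplot matrix colored by class; with many landmarks, plot a few at a time
pairs_plot <- featurePlot(x = iris[, 1:2], y = iris$Species, plot = "pairs",
                          auto.key = list(columns = 3))
print(pairs_plot)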
# Validation & training sets ----------------------------------------------
# Refer to your previous code to split your dataset into validation and training sets
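# A sketch of an 80/20 split with caret, again using iris as a stand-in. The
# 80% proportion and the seed are assumptions, not values taken from the lesson.
# createDataPartition() samples within each class, so every class should appear
# in the training set.
library(caret)

data(iris)
set.seed(7)  # arbitrary seed so the split is reproducible
train_index <- createDataPartition(iris$Species, p = 0.80, list = FALSE)
training   <- iris[train_index, ]   # used to fit the models
validation <- iris[-train_index, ]  # held back for the final check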
# Run algorithms -----------------------------------------------------
# Set up the variables "control" and "metric" in order to run the models
# Build five types of supervised learning models (following the code for the iris data)
# Compare the results of the models
# Which model(s) have high accuracy?
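# A sketch of the model-building step on iris. Only three methods are fitted
# here (lda, rpart, knn) because they rely on packages that ship with R or with
# caret; the five methods in your own iris code may differ. Other caret methods
# such as "svmRadial" (kernlab package) or "rf" (randomForest package) can be
# added to the vector in the same way.
library(caret)

data(iris)
set.seed(7)
train_index <- createDataPartition(iris$Species, p = 0.80, list = FALSE)
training <- iris[train_index, ]

control <- trainControl(method = "cv", number = 10)  # 10-fold cross-validation
metric  <- "Accuracy"

model_methods <- c(lda = "lda", cart = "rpart", knn = "knn")
fits <- lapply(model_methods, function(m) {
  set.seed(7)  # same seed before each fit so all models see the same folds
  train(Species ~ ., data = training, method = m,
        metric = metric, trControl = control)
})

# Collect the cross-validation results and compare accuracy across models
results <- resamples(fits)
print(summary(results))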
# Make predictions --------------------------------------------------------
# Run all five models on the validation data using the predict() function
# View the confusion matrices for the models
# Interpret your results: which models were most accurate?
# Was it easier to classify some species than others?
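# A sketch of the prediction step for a single model (lda) on iris; repeat the
# last three lines for each model you fitted. The byClass table reports
# per-class statistics, which is one way to see whether some species were
# easier to classify than others.
library(caret)

data(iris)
set.seed(7)
train_index <- createDataPartition(iris$Species, p = 0.80, list = FALSE)
training   <- iris[train_index, ]
validation <- iris[-train_index, ]

fit_lda <- train(Species ~ ., data = training, method = "lda",
                 metric = "Accuracy",
                 trControl = trainControl(method = "cv", number = 10))

predictions <- predict(fit_lda, newdata = validation)
cm <- confusionMatrix(predictions, validation$Species)
print(cm)          # confusion matrix plus overall accuracy
print(cm$byClass)  # sensitivity, specificity, etc. for each class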
# Other approaches --------------------------------------------------------
# Now it's time to explore!
# Try adapting your machine learning script above to address a different classification question
# For example: classifying based on genus only (Bradypus vs. Choloepus)
# Picking a genus and just running the models within that genus
# Picking only species with at least a certain number of data points
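# The three variations above mostly come down to subsetting the data before
# modeling. A base R sketch on a tiny made-up data frame: the column names
# (genus, species, lm1) and the numbers are placeholders, not the real columns
# or values in sloth_landmarks.csv.
toy <- data.frame(
  genus   = c("Bradypus", "Bradypus", "Bradypus",
              "Choloepus", "Choloepus", "Choloepus"),
  species = c("variegatus", "variegatus", "tridactylus",
              "didactylus", "didactylus", "hoffmanni"),
  lm1     = c(1.2, 1.4, 1.1, 2.3, 2.5, 2.2),  # invented landmark values
  stringsAsFactors = TRUE
)

# 1. Genus only: drop the species column so it cannot be used as a predictor
genus_data <- toy[, setdiff(names(toy), "species")]

# 2. One genus: keep its rows, then drop the now-empty factor levels
bradypus <- droplevels(toy[toy$genus == "Bradypus", ])

# 3. Species with at least min_n specimens (min_n is an arbitrary threshold)
min_n  <- 2
counts <- table(toy$species)
keep   <- names(counts)[counts >= min_n]
common <- droplevels(toy[toy$species %in% keep, ])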
# UNSUPERVISED LEARNING ---------------------------------------------------
# Load the cluster package
# K-means clustering ------------------------------------------------------
# Use your code from intro_machinelearning_unsupervised.Rmd
# Perform k-means clustering - think about how many clusters you want
# Are you trying to cluster specimens into species or genera?
# Check how the clusters compare to the specimen identities (either genus or species)
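# A sketch of k-means on iris as a stand-in. kmeans() is in base R's stats
# package. centers = 3 matches the three iris species; for the sloth data you
# would pick the number of genera or species instead. Scaling and nstart = 25
# are choices made for this sketch.
data(iris)
measurements <- scale(iris[, 1:4])  # numeric columns only, standardized

set.seed(7)  # k-means starts from random centers
km <- kmeans(measurements, centers = 3, nstart = 25)

# Cross-tabulate cluster membership against the known identities
comparison <- table(cluster = km$cluster, species = iris$Species)
print(comparison)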
# Hierarchical clustering -------------------------------------------------
# Scale the landmark variables you plan to use in your analyses
# Use hclust() to perform agglomerative hierarchical clustering
# Visualize the results
# Use diana() to perform divisive hierarchical clustering
# Visualize the results
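# A sketch of both hierarchical approaches on iris as a stand-in. Euclidean
# distance and complete linkage are assumptions; other choices are equally
# valid. diana() comes from the cluster package.
library(cluster)

data(iris)
scaled <- scale(iris[, 1:4])  # put all variables on the same scale

# Agglomerative: start with single specimens and merge upward
d <- dist(scaled, method = "euclidean")
hc_agg <- hclust(d, method = "complete")
plot(hc_agg, labels = FALSE, hang = -1, main = "Agglomerative (hclust)")

# Divisive: start with one big cluster and split downward
hc_div <- diana(scaled)
pltree(hc_div, main = "Divisive (diana)")

# Cut the agglomerative tree into 3 groups and compare with known identities
print(table(cluster = cutree(hc_agg, k = 3), species = iris$Species))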
# Bonus hierarchical clustering -------------------------------------------
# As a bonus, you can compare the agglomerative and divisive results
# Install and load the dendextend package
# Use the function tanglegram() to compare the two dendrograms you made with hclust() and diana()
# If you get stumped, use the help menu
# You can also look on this website: https://uc-r.github.io/hc_clustering#dendro
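# A sketch of the tanglegram comparison on a 30-row sample of iris (a small
# sample keeps the plot readable; the size and seed are arbitrary). Both trees
# are converted to dendrogram objects first; the diana result goes through
# as.hclust() on the way.
# install.packages("dendextend")  # run once if the package is not installed
library(cluster)
library(dendextend)

data(iris)
set.seed(7)
sample_rows <- sample(nrow(iris), 30)
scaled <- scale(iris[sample_rows, 1:4])

dend_agg <- as.dendrogram(hclust(dist(scaled), method = "complete"))
dend_div <- as.dendrogram(as.hclust(diana(scaled)))

# Lines connect the same specimen in the two trees
tanglegram(dend_agg, dend_div,
           main_left = "hclust", main_right = "diana")

# Entanglement ranges from 0 (leaves line up) to 1 (fully tangled)
print(entanglement(dend_agg, dend_div))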