From e02235ca8b3d29fec672b6fcd5a3ac9b31666127 Mon Sep 17 00:00:00 2001
From: Erin Becker
Date: Tue, 28 Mar 2017 11:40:23 -0700
Subject: [PATCH] remove redundant language about clustering

Addresses https://github.com/datacarpentry/OpenRefine-ecology-lesson/issues/66
---
 episodes/01-working-with-openrefine.md | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/episodes/01-working-with-openrefine.md b/episodes/01-working-with-openrefine.md
index 558cbcdf..d00d9d6d 100644
--- a/episodes/01-working-with-openrefine.md
+++ b/episodes/01-working-with-openrefine.md
@@ -91,11 +91,7 @@ and how many times that value occurs in the column.
 
 ## Cluster
 
-In OpenRefine, clustering means "finding groups of different values that might be alternative representations of the same thing". For example, the two strings "New York" and "new york" are very likely to refer to the same concept and just have capitalization differences. Likewise, "Gödel" and "Godel" probably refer to the same person. Clustering is a very powerful tool for cleaning datasets which contain misspelled or mistyped entries.
-OpenRefine has several clustering algorithms built in. Experiment with them, and learn more about these algorithms and how they work.
-
-In OpenRefine, clustering refers to the operation of "finding groups of different values that might be alternative representations of the same thing". For example, the two strings "New York" and "new york" are very likely to refer to the same concept and just have capitalization differences. Likewise, "Gödel" and "Godel" probably refer to the same person.
-
+In OpenRefine, clustering means "finding groups of different values that might be alternative representations of the same thing". For example, the two strings "New York" and "new york" are very likely to refer to the same concept and just have capitalization differences. Likewise, "Gödel" and "Godel" probably refer to the same person.
 Clustering is a very powerful tool for cleaning datasets which contain misspelled or mistyped entries. OpenRefine has several clustering algorithms built in. Experiment with them, and learn more about these algorithms and how they work.
 > - In the scientificName Text Facet we created in the step above, click the _Cluster_ button.
 > - In the resulting pop-up window, you can change the Method and the Keying Function. Try different combinations to