functionsnashid

During data exploration, developers often write the same code repeatedly to summarise a dataset by different categories. This package aims to remove that boilerplate. Its function count_by_category counts the number of observations per category in a given dataset and returns the top categories by count (the most frequent by default).

Installation

This package is not on CRAN yet. You can install the development version of functionsnashid from its GitHub repository with:

devtools::install_github("stat545ubc-2021/functionsnashid")

Basic Example

See ?count_by_category for a detailed explanation of the function. The following example demonstrates basic usage: it counts the games per value of the genre column in the steam_games dataset.

Results in descending order by default:

suppressMessages(library(tidyverse))
suppressMessages(library(datateachr))
library(functionsnashid)

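# Optional preprocessing: split comma-separated genres into one row per genre.
# Note: the calls below use the raw steam_games table, not `games`, so each
# comma-separated genre combination is counted as its own category.
games <- steam_games %>%
  select(id, name, genre, publisher, developer, original_price, release_date, all_reviews) %>%
  separate_rows(genre, sep = ",", convert = TRUE)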

count_by_category(steam_games, genre, 5)
#> # A tibble: 5 × 2
#>   genre                  count
#>   <chr>                  <int>
#> 1 Action                  2386
#> 2 Action,Indie            2129
#> 3 Casual,Indie            1732
#> 4 Action,Adventure,Indie  1585
#> 5 Adventure,Indie         1520
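
The output above counts each comma-separated combination (for example Action,Indie) as its own category, because the genre column is used as-is. The sketch below illustrates, on a small made-up table, how splitting the column with separate_rows changes what gets counted. The tibble toy_games and its contents are hypothetical and are not part of steam_games; the counting is done with dplyr directly rather than with this package.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data, standing in for the genre column of steam_games.
toy_games <- tibble(
  name  = c("A", "B", "C"),
  genre = c("Action,Indie", "Action", "Indie,Casual")
)

# One row per individual genre, then count rows per genre (largest first).
per_genre <- toy_games %>%
  separate_rows(genre, sep = ",") %>%
  count(genre, name = "count", sort = TRUE)

print(per_genre)
```

With the package itself, the analogous call would presumably be count_by_category(games, genre, 5), using the games table built earlier; its output is not shown here.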

Results in ascending order:

count_by_category(steam_games, genre, 5, FALSE)
#> # A tibble: 5 × 2
#>   genre                                                                    count
#>   <chr>                                                                    <int>
#> 1 Accounting,Animation & Modeling,Audio Production,Design & Illustration,…     1
#> 2 Accounting,Education,Software Training,Utilities,Early Access                1
#> 3 Action,Adventure,Casual,Early Access                                         1
#> 4 Action,Adventure,Casual,Free to Play                                         1
#> 5 Action,Adventure,Casual,Free to Play,Early Access                            1
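
For readers curious about the boilerplate this function replaces, the behaviour shown above can be approximated with plain dplyr. This is a minimal sketch and not the package's source: the name top_counts, its argument names, and the toy tibble are all assumptions made for illustration.

```r
library(dplyr)

# Hypothetical stand-in for count_by_category(); not the package's actual code.
top_counts <- function(data, column, n = 5, descending = TRUE) {
  # Count rows per distinct value of `column`, storing the result in `count`.
  counts <- data %>%
    count({{ column }}, name = "count")

  # Order by frequency in the requested direction.
  if (descending) {
    counts <- arrange(counts, desc(count))
  } else {
    counts <- arrange(counts, count)
  }

  # Keep only the first n categories.
  head(counts, n)
}

# Hypothetical toy data.
toy <- tibble(fruit = c("apple", "pear", "apple", "fig", "apple", "pear"))
result <- top_counts(toy, fruit, 2)
print(result)
```

Passing descending = FALSE in this sketch would list the least frequent categories first, mirroring the fourth argument used in the ascending example.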

More Examples with Different Datasets

The following examples use count_by_category to explore other datasets:

Get the count of trees per genus in the `vancouver_trees` dataset.

The output shows that Acer, the genus of maple trees, is the most common in Vancouver.

count_by_category(vancouver_trees, genus_name, 5)
#> # A tibble: 5 × 2
#>   genus_name count
#>   <chr>      <int>
#> 1 ACER       36062
#> 2 PRUNUS     30683
#> 3 FRAXINUS    7381
#> 4 TILIA       6773
#> 5 QUERCUS     6119

Get the count of apartment buildings per property type in the `apt_buildings` dataset.

count_by_category(apt_buildings, property_type, 5)
#> # A tibble: 3 × 2
#>   property_type  count
#>   <chr>          <int>
#> 1 PRIVATE         2888
#> 2 TCHC             327
#> 3 SOCIAL HOUSING   240

What heating types are common in the `apt_buildings` dataset?

count_by_category(apt_buildings, heating_type, 5)
#> # A tibble: 3 × 2
#>   heating_type   count
#>   <chr>          <int>
#> 1 HOT WATER       2789
#> 2 FORCED AIR GAS   315
#> 3 ELECTRIC         265
