# Data Wrangling in R (part 2)

## Table of Contents

- [Reshaping Data with tidyr](#resh)


---
<a id='resh'></a>

## Reshaping Data with tidyr

One of the most common data wrangling challenges is adjusting how rows and columns are used to represent data. [**tidyr ("tidy-er")**](https://tidyr.tidyverse.org/) is a package that helps reshape data frames into the shape (orientation) needed for visualization, running a statistical model, or implementing a machine learning algorithm. **tidyr** helps you follow the **principles of tidy data**. Tidy data is data where:

- Every column is a variable.
- Every row is an observation.
- Every cell is a single value.


<img src="images/data-wide.png" alt="" style="width: 500px;"/>

This format is wide because the price data is spread across multiple columns, one per band.

<img src="images/data-long.png" alt="" style="width: 500px;"/>

This format is long because the price data has its own column. As a consequence, city and band values are repeated across rows.

In [None]:
# gather() - move from wide format to long format:
# all of the prices are gathered into a single column
library(tidyr)

# Reshape by gathering prices into a single feature
band_data_long <- gather(
    band_data_wide, # data frame to gather from
    # name for new column listing the gathered features 
    # (will contain values of column names from the wide form)
    key = band, 
    # name for new column listing the gathered values 
    # (here will be all gathered values)
    value = price, 
    # columns to gather data from
    # (gather from all columns except city)
    -city 
)

<img src="images/data-tidyr-gather.png" alt="" style="width: 600px;"/>
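
Since tidyr 1.0.0, `gather()` is superseded by `pivot_longer()`, which is the recommended function for new code. The sketch below runs the same reshape with both functions. The data frame is only a hypothetical stand-in for `band_data_wide`: the city names, band names and prices are invented for illustration and are not the values from the figures.

```r
library(tidyr)

# Hypothetical stand-in for band_data_wide (all values invented for illustration)
band_data_wide <- data.frame(
    city = c("Seattle", "Portland"),
    band_a = c(40, 35),
    band_b = c(55, 50)
)

# Same reshape as the gather() call above
long_gather <- gather(band_data_wide, key = band, value = price, -city)

# Equivalent call with the newer pivot_longer()
long_pivot <- pivot_longer(
    band_data_wide,
    cols = -city,        # reshape every column except city
    names_to = "band",   # former column names end up here
    values_to = "price"  # former cell values end up here
)

print(long_gather)
print(long_pivot)
```

Both results contain the same city, band and price combinations, although the row order may differ between the two functions.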


In [None]:
# spread() - from rows to columns, from long into wide format

# Reshape by spreading prices out among multiple features
price_by_band <- spread(
    band_data_long, # data frame to spread from
    key = city, # get new column names from this column
    value = price # get values for the new columns from this column
)

<img src="images/data-tidyr-spread.png" alt="" style="width: 600px;"/>
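
In the same way, `spread()` is superseded by `pivot_wider()`. The following sketch repeats the reshape above with both functions, again on a hypothetical stand-in for `band_data_long` whose values are invented for illustration.

```r
library(tidyr)

# Hypothetical stand-in for band_data_long (all values invented for illustration)
band_data_long <- data.frame(
    city = c("Seattle", "Portland", "Seattle", "Portland"),
    band = c("band_a", "band_a", "band_b", "band_b"),
    price = c(40, 35, 55, 50)
)

# Same reshape as the spread() call above:
# one row per band, one price column per city
wide_spread <- spread(band_data_long, key = city, value = price)

# Equivalent call with the newer pivot_wider()
wide_pivot <- pivot_wider(
    band_data_long,
    names_from = city,   # new column names come from this column
    values_from = price  # new cell values come from this column
)

print(wide_spread)
print(wide_pivot)
```

The two results hold the same prices, but the order of the new city columns may differ.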


In [None]:
# Unite multiple columns into a single column
# unite()

# Separate a single column into multiple columns
# separate()
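
As a minimal sketch of how these two functions might be used, the example below combines two columns with `unite()` and splits them again with `separate()`. The `venues` data frame, its `state` column and the `location` column name are hypothetical and exist only for this illustration.

```r
library(tidyr)

# Hypothetical data: each venue described by separate city and state columns
venues <- data.frame(
    city = c("Seattle", "Portland"),
    state = c("WA", "OR")
)

# unite(): paste city and state into a single "location" column
# (the source columns are dropped by default, remove = TRUE)
venues_united <- unite(venues, col = "location", city, state, sep = ", ")

# separate(): split "location" back into its two parts
venues_separated <- separate(
    venues_united,
    col = location,
    into = c("city", "state"),
    sep = ", "
)

print(venues_united)
print(venues_separated)
```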

---
<a id='data'></a>