Use pivot_longer() and pivot_wider() instead of gather() and spread() #570

Closed
matt-dray opened this issue Oct 10, 2019 · 3 comments · Fixed by #583

Comments

@matt-dray
Contributor

Lesson: Dataframe Manipulation with tidyr.

Suggestion: replace gather() and spread() with pivot_longer() and pivot_wider(), respectively.

Reason: these new, more intuitive functions are intended to be the de facto approach to making data 'longer' and 'wider' as of {tidyr} version 1.0.0 (though gather() and spread() continue to work).
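To make the suggested swap concrete, here is a minimal sketch of the correspondence between the two pairs of functions. The `wide` data frame is a made-up toy example, not the lesson's gapminder data, and the column names `obstype_year` and `obs_values` are borrowed from the lesson purely for illustration.

```r
library(tidyr)

# Hypothetical toy data (NOT the lesson's gapminder file)
wide <- data.frame(
  country  = c("X", "Y"),
  pop_1952 = c(10, 20),
  pop_1957 = c(11, 21),
  stringsAsFactors = FALSE
)

# Wide to long: older interface, then the tidyr >= 1.0.0 interface
long_old <- gather(wide, obstype_year, obs_values, -country)
long_new <- pivot_longer(
  wide,
  cols      = -country,
  names_to  = "obstype_year",
  values_to = "obs_values"
)

# Long to wide: older interface, then the tidyr >= 1.0.0 interface
wide_old <- spread(long_old, obstype_year, obs_values)
wide_new <- pivot_wider(
  long_new,
  names_from  = obstype_year,
  values_from = obs_values
)
```

Roughly, the `key`/`value` arguments of gather() become `names_to`/`values_to` in pivot_longer(), and the `key`/`value` arguments of spread() become `names_from`/`values_from` in pivot_wider().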

@matt-dray
Contributor Author

I've had a look at doing this in a branch of a fork of the repo (link valid at time of posting).

Looks like outputs from the pivot functions differ from gather() and spread(): they have tibble class as well as data.frame and are ordered differently. This means some of the figures need updating as well as the text.

For example, consider the gap_long object from the 'From wide to long format with gather()' section, which was created using gather(). str(gap_long) gives:

'data.frame':	5112 obs. of  4 variables:
 $ continent   : chr  "Africa" "Africa" "Africa" "Africa" ...
 $ country     : chr  "Algeria" "Angola" "Benin" "Botswana" ...
 $ obstype_year: chr  "pop_1952" "pop_1952" "pop_1952" "pop_1952" ...
 $ obs_values  : num  9279525 4232095 1738315 442308 4469979 ...

If the same code is re-run with pivot_longer() instead of gather(), the output of str(gap_long) looks like this:

Classes 'tbl_df', 'tbl' and 'data.frame':	5112 obs. of  4 variables:
 $ continent   : chr  "Africa" "Africa" "Africa" "Africa" ...
 $ country     : chr  "Algeria" "Algeria" "Algeria" "Algeria" ...
 $ obstype_year: chr  "pop_1952" "pop_1957" "pop_1962" "pop_1967" ...
 $ obs_values  : num  9279525 10270856 11000948 12760499 14760787 ...

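So pivot_longer() returned a tibble (classes 'tbl_df' and 'tbl' in addition to 'data.frame') and ordered the rows by continent and country, with obstype_year varying within each country. gather(), by contrast, returned a plain data.frame with rows stacked one obstype_year at a time.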
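The difference can be reproduced on a much smaller input. This is only a sketch on a hypothetical two-country data frame with made-up values (not the lesson's gapminder data). It assumes the input is a plain data.frame, as in the str() output above.

```r
library(tidyr)

# Hypothetical toy data with made-up values (NOT the gapminder file)
wide <- data.frame(
  continent = c("Africa", "Africa"),
  country   = c("X", "Y"),
  pop_1952  = c(10, 20),
  pop_1957  = c(11, 21),
  stringsAsFactors = FALSE
)

long_gather <- gather(wide, obstype_year, obs_values, -continent, -country)
long_pivot  <- pivot_longer(
  wide,
  cols      = -c(continent, country),
  names_to  = "obstype_year",
  values_to = "obs_values"
)

# gather() stacks one source column at a time:
#   country is X, Y, X, Y
# pivot_longer() works through the input row by row:
#   country is X, X, Y, Y
long_gather$country
long_pivot$country

# gather() keeps the input class; pivot_longer() returns a tibble
class(long_gather)
class(long_pivot)

# Sorting the gather() result by country and obstype_year lines the
# two outputs up again (base R, so no extra package is assumed)
sorted_gather <- long_gather[
  order(long_gather$country, long_gather$obstype_year),
]
```

If downstream lesson code relies on the old row order or on a plain data.frame, an explicit sort like the one above or a call to as.data.frame() on the pivot_longer() result would be one way to keep the existing figures and text consistent.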

@jcoliver
Contributor

Hmmm...interesting. Any reason why the ordering in the output is different between gather (continent, obstype_year, country) and pivot_longer (continent, country, obstype_year)?

@katrinleinweber
Contributor

Related: datacarpentry/r-socialsci#104
