In [2]:
library(tidyverse)

# Re-convert character columns in existing data frame

This is useful if you need to do some manual munging: you can read the columns in as character, clean them up with (e.g.) regular expressions, and then let readr take another stab at parsing them. The name is a homage to the base

```r
type_convert(
  df,
  col_types = NULL,
  na = c("", "NA"),
  trim_ws = TRUE,
  locale = default_locale()
)
```
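
As a rough illustration of how these arguments fit together, the sketch below supplies an explicit `col_types` specification and an extra missing-value marker. The data frame `raw`, its column names, and the `"missing"` sentinel are made up for this example and are not part of readr.

```r
library(readr)

# Hypothetical input: every column is stored as character
raw <- data.frame(
  id    = c("1", "2", "3"),
  score = c("1.5", "missing", "4.0"),
  stringsAsFactors = FALSE
)

converted <- type_convert(
  raw,
  # Name the target types explicitly instead of letting readr guess
  col_types = cols(id = col_integer(), score = col_double()),
  # Treat the (assumed) sentinel "missing" as NA, in addition to the defaults
  na = c("", "NA", "missing")
)

str(converted)
```

Passing `col_types` skips the guessing step for the named columns, which may be preferable when the first rows of a column are not representative of the rest.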

# Note

**`type_convert()`** removes the `spec` attribute from its result, because it likely modifies the column data types, so a stored column specification would no longer describe them.
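
One way to check this behaviour is to compare the `spec` attribute before and after conversion. This is a minimal sketch that assumes the bundled `mtcars.csv` example file is available through `readr_example()`. The variable names `chr_data` and `converted` are illustrative.

```r
library(readr)

# Read every column as character; read_csv() stores the column specification
chr_data <- read_csv(
  readr_example("mtcars.csv"),
  col_types = cols(.default = col_character())
)

# Specification recorded at read time
attr(chr_data, "spec")

# Re-guess the column types, then inspect the attribute again
converted <- type_convert(chr_data)
is.null(attr(converted, "spec"))
```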

# Examples

Sometimes it’s easier to diagnose problems if you just read in all the columns as character vectors:

In [3]:
challenge2 <- read_csv(readr_example('challenge.csv'), col_types = cols(.default = 'c'))

challenge2 %>% glimpse()

Rows: 2,000
Columns: 2
$ x <chr> "404", "4172", "3004", "787", "37", "2332", "2489", "1449", "3665...
$ y <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N...


This is particularly useful in conjunction with **`type_convert()`**, which applies the parsing heuristics to the character columns in a data frame.

In [4]:
df <- tribble(
  ~x,  ~y,
  "1", "1.21",
  "2", "2.32",
  "3", "4.56"
)

df %>% glimpse()

Rows: 3
Columns: 2
$ x <chr> "1", "2", "3"
$ y <chr> "1.21", "2.32", "4.56"


In [5]:
df1 <- df %>% type_convert()
df1 %>% glimpse()


-- Column specification ------------------------------------------------------------------------------------------------
cols(
  x = col_double(),
  y = col_double()
)



Rows: 3
Columns: 2
$ x <dbl> 1, 2, 3
$ y <dbl> 1.21, 2.32, 4.56
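
The manual-munging workflow mentioned in the description (read as character, clean with regular expressions, then re-parse) might look like the following sketch. The `raw` data frame, its column names, and the currency and unit formats are invented for illustration.

```r
library(readr)

# Hypothetical messy input that would not parse as numbers directly
raw <- data.frame(
  price = c("$1,200", "$350", "$4,075"),
  qty   = c("3 units", "10 units", "7 units"),
  stringsAsFactors = FALSE
)

# Clean the strings with base R regular expressions
raw$price <- gsub("[$,]", "", raw$price)   # drop currency symbol and thousands separator
raw$qty   <- sub(" units$", "", raw$qty)   # drop the trailing unit label

# Let readr guess the column types from the cleaned strings
clean <- type_convert(raw)
str(clean)
```

readr also offers `parse_number()` for cases like these; the regex route is shown here only because it matches the workflow the text describes.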


---

In [4]:
# set the data type of all columns to character
data <- read_csv(readr_example("mtcars.csv"),
                 col_types = cols(.default = col_character()))
str(data)

tibble [32 x 11] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ mpg : chr [1:32] "21" "21" "22.8" "21.4" ...
 $ cyl : chr [1:32] "6" "6" "4" "6" ...
 $ disp: chr [1:32] "160" "160" "108" "258" ...
 $ hp  : chr [1:32] "110" "110" "93" "110" ...
 $ drat: chr [1:32] "3.9" "3.9" "3.85" "3.08" ...
 $ wt  : chr [1:32] "2.62" "2.875" "2.32" "3.215" ...
 $ qsec: chr [1:32] "16.46" "17.02" "18.61" "19.44" ...
 $ vs  : chr [1:32] "0" "0" "1" "1" ...
 $ am  : chr [1:32] "1" "1" "1" "0" ...
 $ gear: chr [1:32] "4" "4" "4" "3" ...
 $ carb: chr [1:32] "4" "4" "1" "1" ...
 - attr(*, "spec")=
  .. cols(
  ..   .default = col_character(),
  ..   mpg = col_character(),
  ..   cyl = col_character(),
  ..   disp = col_character(),
  ..   hp = col_character(),
  ..   drat = col_character(),
  ..   wt = col_character(),
  ..   qsec = col_character(),
  ..   vs = col_character(),
  ..   am = col_character(),
  ..   gear = col_character(),
  ..   carb = col_character()
  .. )


In [5]:
# use type_convert() to guess and parse the data type of each column
type_convert(data)


-- Column specification ------------------------------------------------------------------------------------------------
cols(
  mpg = col_double(),
  cyl = col_double(),
  disp = col_double(),
  hp = col_double(),
  drat = col_double(),
  wt = col_double(),
  qsec = col_double(),
  vs = col_double(),
  am = col_double(),
  gear = col_double(),
  carb = col_double()
)



mpg,cyl,disp,hp,drat,wt,qsec,vs,am,gear,carb
21.0,6,160.0,110,3.9,2.62,16.46,0,1,4,4
21.0,6,160.0,110,3.9,2.875,17.02,0,1,4,4
22.8,4,108.0,93,3.85,2.32,18.61,1,1,4,1
21.4,6,258.0,110,3.08,3.215,19.44,1,0,3,1
18.7,8,360.0,175,3.15,3.44,17.02,0,0,3,2
18.1,6,225.0,105,2.76,3.46,20.22,1,0,3,1
14.3,8,360.0,245,3.21,3.57,15.84,0,0,3,4
24.4,4,146.7,62,3.69,3.19,20.0,1,0,4,2
22.8,4,140.8,95,3.92,3.15,22.9,1,0,4,2
19.2,6,167.6,123,3.92,3.44,18.3,1,0,4,4
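
Both examples above end up with double columns only. As a small sketch of how the guessing behaves on other kinds of input, the data frame below (entirely made up for this example) mixes logical-looking strings, ISO 8601 date strings, free text, and a column that is already an integer. Only character columns are candidates for conversion, so the integer column should pass through unchanged.

```r
library(readr)

# Hypothetical data: three character columns and one integer column
mixed <- data.frame(
  flag  = c("TRUE", "FALSE", "TRUE"),
  day   = c("2021-01-05", "2021-02-10", "2021-03-15"),
  label = c("a", "b", "c"),
  n     = c(1L, 2L, 3L),
  stringsAsFactors = FALSE
)

guessed <- type_convert(mixed)

# Report the first class of each column after conversion
sapply(guessed, function(col) class(col)[1])
```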
