version 1.0.1

cran · Mar 3, 2022 · c8ad573
1 parent 87e85f3
commit c8ad573

Showing 6 changed files with 25 additions and 20 deletions.
diff --git a/DESCRIPTION b/DESCRIPTION
@@ -1,6 +1,6 @@
 Package: piecemaker
 Title: Tools for Preparing Text for Tokenizers
-Version: 1.0.0
+Version: 1.0.1
 Authors@R: c(
     person(given = "Jon",
            family = "Harmon",
@@ -20,18 +20,18 @@ Description: Tokenizers break text into pieces that are more usable by machine
     provides those shared steps, along with a simple tokenizer.
 License: Apache License (>= 2)
 Encoding: UTF-8
-RoxygenNote: 7.1.1
+RoxygenNote: 7.1.2
 URL: https://github.com/macmillancontentscience/piecemaker
 BugReports: https://github.com/macmillancontentscience/piecemaker/issues
 Suggests: testthat (>= 3.0.0)
 Config/testthat/edition: 3
-Imports: purrr, rlang (>= 0.4.2), stringi, stringr
+Imports: rlang (>= 0.4.2), stringi, stringr
 Depends: R (>= 2.10)
 NeedsCompilation: no
-Packaged: 2021-08-05 22:01:09 UTC; jonth
+Packaged: 2022-03-03 14:07:56 UTC; jonth
 Author: Jon Harmon [aut, cre] (<https://orcid.org/0000-0003-4781-4346>),
   Jonathan Bratt [aut] (<https://orcid.org/0000-0003-2859-0076>),
   Bedford Freeman & Worth Pub Grp LLC DBA Macmillan Learning [cph]
 Maintainer: Jon Harmon <jonthegeek@gmail.com>
 Repository: CRAN
-Date/Publication: 2021-08-06 17:50:06 UTC
+Date/Publication: 2022-03-03 15:50:06 UTC
diff --git a/MD5 b/MD5
@@ -1,12 +1,12 @@
-c767b88edf1431c320ec308ea028608a *DESCRIPTION
+59bb7c51f35c4f5f909f31c120778ca5 *DESCRIPTION
 e40f9d15973f27dbd85a5b7ca13062bc *NAMESPACE
-85e3bf16169091c7126a39899d22a92d *NEWS.md
-abdfe1a5b353519ce58aff60629ce4a5 *R/clean.R
+43c8031e43e1d30c6aea95555d0b037d *NEWS.md
+a5522e64e6dba23da3330330350b9f0f *R/clean.R
 e8d211b2576bb8953aacca4f2e7ec7b6 *R/space.R
 006f1b514ff3af4d7eb082ac04fcdf9d *R/sysdata.rda
 90253c7c4d2b4c4f737c6911c9b7529a *R/tokenize.R
-dc7be9461a60c95c57c324edbd9128f0 *README.md
-676f3051082f94d86c8d113210375891 *man/dot-coerce_to_utf8.Rd
+4a53ef5c81afd882764ca44014b3d7aa *README.md
+47b6424eda6652547190bc6b29e44327 *man/dot-coerce_to_utf8.Rd
 defea65ac9f9add858e4ec26d36bf1ba *man/dot-make_unicode_block_regex.Rd
 a15d6972ad34dad2855d96f4333a91f6 *man/dot-space_regex_selector.Rd
 d92843f3e86ec6583db3d96b020d5b2e *man/prepare_and_tokenize.Rd

diff --git a/NEWS.md b/NEWS.md
@@ -1,3 +1,7 @@
+# piecemaker 1.0.1
+
+* Removed purrr dependency.
+
 # piecemaker 1.0.0
 
 * Added a `NEWS.md` file to track changes to the package.

diff --git a/R/clean.R b/R/clean.R
@@ -43,17 +43,19 @@ validate_utf8 <- function(text) {
   Encoding(text[in_encoding_status]) <- "UTF-8"
 
   # Now try to coerce the leftovers to UTF-8.
-  text[!in_encoding_status] <- purrr::map_chr(
-    text[!in_encoding_status],
-    .coerce_to_utf8
+  text[!in_encoding_status] <- vapply(
+    X = text[!in_encoding_status],
+    FUN = .coerce_to_utf8,
+    FUN.VALUE = character(1),
+    USE.NAMES = FALSE
   )
 
   return(text)
 }
 
 #' Coerce to UTF8
 #'
-#' @param this_text Character scalar; a piece of text to attemp to coerce.
+#' @param this_text Character scalar; a piece of text to attempt to coerce.
 #'
 #' @return The text as UTF8.
 #' @keywords internal

diff --git a/README.md b/README.md
@@ -11,19 +11,18 @@ experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](h
 
 Tokenizers break text into pieces that are more usable by machine
 learning models. While writing
-[wordpiece](https://github.com/jonathanbratt/wordpiece) and
+[wordpiece](https://github.com/macmillancontentscience/wordpiece) and
 [morphemepiece](https://github.com/macmillancontentscience/morphemepiece),
-we found that many steps were shared between those package. This package
-provides those shared steps.
+we found that many steps were shared between those packages. This
+package provides those shared steps.
 
 ## Installation
 
 You can install the released version of piecemaker from
 [CRAN](https://CRAN.R-project.org) with:
 
 ``` r
-# Not yet.
-#install.packages("piecemaker")
+install.packages("piecemaker")
 ```
 
 And the development version from [GitHub](https://github.com/) with:

diff --git a/man/dot-coerce_to_utf8.Rd b/man/dot-coerce_to_utf8.Rd