Merge pull request #2345 from quanteda/fix-docs
Minor documentation fixes
kbenoit committed Feb 13, 2024
2 parents ec96ed5 + 54a2689 commit 06f424b
Showing 7 changed files with 33 additions and 25 deletions.
2 changes: 1 addition & 1 deletion R/data-documentation.R
@@ -105,7 +105,7 @@
#' Dictionary*. Available at <https://www.snsoroka.com/data-lexicoder/>.
#'
#' Young, L. & Soroka, S. (2012). Affective News: The Automated Coding of
-#' Sentiment in Political Texts]. \doi{10.1080/10584609.2012.671234}.
+#' Sentiment in Political Texts. \doi{10.1080/10584609.2012.671234}.
#' *Political Communication*, 29(2), 205--231.
#' @keywords data
#' @examples
5 changes: 3 additions & 2 deletions R/quanteda-documentation.R
@@ -105,7 +105,7 @@
#' (literal) pattern matching.} }
#' @note If "fixed" is used with `case_insensitive = TRUE`, features will
#' typically be lowercased internally prior to matching. Also, glob matches
-#' are converted to regular expressions (using [glob2rx][utils::glob2rx]) when
+#' are converted to regular expressions (using [utils::glob2rx()]) when
#' they contain wild card characters, and to fixed pattern matches when they
#' do not.
#' @name valuetype
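
Note on the `valuetype` hunk above: the glob-to-regex conversion it mentions can be tried directly in base R. This is a minimal sketch that assumes only `utils::glob2rx()`. The example patterns are made up for illustration, and the wildcard check is an assumption about what "contain wild card characters" means, not quanteda's internal code path.

```r
# Hypothetical glob patterns, chosen only for illustration
globs <- c("pres*", "tax?", "house")

# Assumption: a "wild card" pattern is one containing * or ?
has_wildcard <- grepl("[*?]", globs)

# utils::glob2rx() is base R: it anchors the pattern and translates
# * to .* and ? to . (a trailing .* is trimmed by default)
regexes <- utils::glob2rx(globs[has_wildcard])
print(regexes)

# Patterns without wildcards can be matched literally (fixed matching)
fixed <- globs[!has_wildcard]
print(fixed)
```

Patterns without wildcards skip the regular expression step entirely, which is the case the note describes as falling back to fixed pattern matches.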
@@ -124,7 +124,7 @@ NULL
#' or collocations object. See [pattern] for details.
#' @details The `pattern` argument is a vector of patterns, including
#' sequences, to match in a target object, whose match type is specified by
-#' [valuetype()]. Note that an empty pattern (`""`) will match
+#' [valuetype]. Note that an empty pattern (`""`) will match
#' "padding" in a [tokens] object.
#' \describe{
#' \item{`character`}{A character vector of token patterns to be selected
@@ -172,6 +172,7 @@ NULL
#' (dict1 <- dictionary(list(us = c("president", "white house", "house of representatives"))))
#' phrase(dict1)
#' @keywords internal
+#' @seealso [valuetype], [case_insensitive]
NULL

#' Grouping variable(s) for various functions
20 changes: 11 additions & 9 deletions R/tokens.R
@@ -2,19 +2,21 @@

#' Construct a tokens object
#'
-#' Construct a tokens object, either by importing a named list of characters
-#' from an external tokenizer, or by calling the internal \pkg{quanteda}
-#' tokenizer.
+#' @description Construct a tokens object, either by importing a named list of
+#' characters from an external tokenizer, or by calling the internal
+#' \pkg{quanteda} tokenizer.
#'
-#' `tokens()` works on tokens class objects, which means that the removal rules
-#' can be applied post-tokenization, although it should be noted that it will
-#' not be possible to remove things that are not present. For instance, if the
-#' `tokens` object has already had punctuation removed, then `tokens(x,
-#' remove_punct = TRUE)` will have no additional effect.
+#' @description `tokens()` can also be applied to tokens class objects, which
+#' means that the removal rules can be applied post-tokenization, although it
+#' should be noted that it will not be possible to remove things that are not
+#' present. For instance, if the `tokens` object has already had punctuation
+#' removed, then `tokens(x, remove_punct = TRUE)` will have no additional
+#' effect.
#' @param x the input object to the tokens constructor; a [tokens], [corpus] or
#' [character] object to tokenize.
#' @param what character; which tokenizer to use. The default `what = "word"`
-#' is the version 2 \pkg{quanteda} tokenizer. Legacy tokenizers (version < 2)
+#' is the current version of the \pkg{quanteda} tokenizer, set by
+#' `quanteda_options(tokens_tokenizer_word)`. Legacy tokenizers (version < 2)
#' are also supported, including the default `what = "word1"`. See the Details
#' and quanteda Tokenizers below.
#' @param remove_punct logical; if `TRUE` remove all characters in the Unicode
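
Note on the `tokens()` description in the hunk above: the claim that removal rules can be applied post-tokenization, and that re-applying one has no further effect, can be checked with a short sketch. It assumes the quanteda package is installed, and the sample sentence is made up for illustration.

```r
library(quanteda)

# Tokenize a made-up sentence, keeping punctuation (the default)
toks <- tokens("Hello, world! This is a test.")

# Apply the removal rule after tokenization, on the tokens object itself
toks_nopunct <- tokens(toks, remove_punct = TRUE)

# Re-applying the same rule cannot remove what is no longer present
toks_again <- tokens(toks_nopunct, remove_punct = TRUE)

print(toks_nopunct)
print(identical(as.list(toks_nopunct), as.list(toks_again)))
```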
2 changes: 1 addition & 1 deletion man/data_dictionary_LSD2015.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

5 changes: 4 additions & 1 deletion man/pattern.Rd

22 changes: 12 additions & 10 deletions man/tokens.Rd

2 changes: 1 addition & 1 deletion man/valuetype.Rd
