
Incorrectly reads # chunks #13

Closed
DataStrategist opened this issue Mar 6, 2017 · 0 comments

@DataStrategist
Owner

Chunks that are commented out with a # should not be read, but for now they are. The closest I got was this:
str_extract_all(b, "\\#{0,4} *[a-zA-Z0-9\\._]+ *\\<\\- *[a-zA-Z0-9\\._]+?\\(\\{.+?\\}\\)", simplify = FALSE) %>% .[[1]]

But it picks up the ## from the line above and is therefore useless. I think the only way to resolve this is to import the files with something other than read_file, because read_file returns the whole file as a single string rather than line by line, which makes it hard to tell whether a # sits at the beginning of a line or not.
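One possible way to do this, as a minimal sketch rather than the package's actual fix: read the file line by line (for example with base R's readLines), drop every line whose first non-blank character is a #, and only then search for chunk definitions. The sample lines and the names src, is_commented, live and chunks are made up for illustration, and the pattern is a simplified version of the one above. It assumes each chunk fits on one line.

```r
# Hypothetical sample input; in practice this would be readLines(path).
src <- c(
  "a <- reactive({ input$x })",
  "# b <- reactive({ input$y })",
  "  ## c <- reactive({ input$z })",
  "d <- reactive({ 1 }) # trailing comment"
)

# TRUE for lines whose first non-blank character is '#'.
is_commented <- grepl("^\\s*#", src)

# Keep only the live lines, then search those for chunk definitions.
live <- src[!is_commented]
pattern <- "[a-zA-Z0-9._]+ *<- *[a-zA-Z0-9._]+\\(\\{.+?\\}\\)"

# regmatches() with regexpr() returns the first match of each matching line.
chunks <- regmatches(live, regexpr(pattern, live, perl = TRUE))
print(chunks)
```

Because the comment check runs per line, a # on one line can no longer leak into the match on the next. Chunks that span several lines would need extra handling that this sketch does not attempt.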

DataStrategist added a commit that referenced this issue Mar 6, 2017
…ixed a bug in Chunking

This adds functionality for #12

Re: bug... check #13