Type Detection #30

Closed
gridseth opened this issue Jun 23, 2016 · 6 comments

Comments

@gridseth

Summary:

When parsing files, I ran into problems with how paratext interprets some of the elements in a file. I believe these are related to the process_token() function in colbased_worker.hpp, which tries to detect whether a token is an integer, a float, an exponential/scientific number, or something else.

I have written code to make changes to this function, but was not able to create a pull request. Please let me know if there is a way I can create a pull request or provide the code change suggestions.

Examples:

It detects a string such as A.1 as a number and I get 0.000000
It detects a string such as 3ABC as a number and I get 3.000000

Details:

Let A.1 be the input. When reading the token and checking whether it is an integer (line 270), the code checks if token_[i] is a digit. If not, we move on to see if we are dealing with a float instead. However, we advance the index i regardless of whether the integer check passes or fails. Therefore, when we get to the float check on line 279, we are looking at the . character instead of A. The float check then passes, since we see . followed only by digits. Finally, the result is 0.000000, since A.1 gets converted to a float before it is passed to process_float.

Numbers like 12.345 are not picked up as floats (the integer check fails on . and the float check then looks at the next character, here 3) but as exponentials. They are classified as exponentials not because they pass the exponential check, but because exp_possible is set to true at the beginning (on line 272, after the integer check passes on line 270) and never becomes false. Both exponentials and floats are passed to process_float in the end. For the same reason, 3ABC is detected as an exponential instead of a string and we get 3.000000. (Numbers like .123 are not detected as floats or exponentials because exp_possible is set to false after the integer check.)
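
To make the failure mode concrete, here is a small stand-alone sketch that reproduces the behaviour described above. It is not the code from colbased_worker.hpp: the names Kind and classify_buggy are hypothetical, and the control flow is only my reading of the description (index advanced unconditionally, exp_possible never cleared).

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical names for illustration; not paratext's actual types.
enum class Kind { Integer, Float, Exponential, Text };

// Simplified reconstruction of the faulty scan described in this issue.
Kind classify_buggy(const std::string& tok) {
    std::size_t i = 0;
    bool exp_possible = false;

    // Integer check: consume leading digits.
    while (i < tok.size() && std::isdigit(static_cast<unsigned char>(tok[i]))) {
        exp_possible = true;  // set once a digit is seen, never reset
        ++i;
    }
    if (i == tok.size() && exp_possible) return Kind::Integer;

    ++i;  // BUG: skips the character that failed the integer check

    // Float check: expects '.' followed only by digits.
    if (i < tok.size() && tok[i] == '.') {
        std::size_t j = i + 1;
        while (j < tok.size() && std::isdigit(static_cast<unsigned char>(tok[j]))) ++j;
        if (j == tok.size()) return Kind::Float;
    }

    // Exponential "check": only the stale flag is consulted.
    if (exp_possible) return Kind::Exponential;
    return Kind::Text;
}
```

With this sketch, A.1 comes back as Float, 12.345 and 3ABC come back as Exponential, and .123 comes back as Text, which matches the symptoms listed above.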

Note:

When making updates to the process_token() function I did some simplifications, but did not change the behaviour of the function other than for the issues found.

In the code change suggestion, I have let a number of the form 14e-3 be a valid exponential (compared to 14.0e-3). This can be changed if not desired.
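
For reference, a stricter classification could look roughly like the sketch below. This is not the code from the proposed change; TokenKind and classify_strict are hypothetical names, and several choices are assumptions on my part: an optional leading sign is allowed, .123 and 12. count as floats, and both 14e-3 and 14.0e-3 count as exponentials.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical names for illustration; not paratext's actual types.
enum class TokenKind { Integer, Float, Exponential, Text };

// Classifies a token in a single left-to-right pass; every character
// must be accounted for, otherwise the token is treated as text.
TokenKind classify_strict(const std::string& tok) {
    const std::size_t n = tok.size();
    std::size_t i = 0;

    // Consumes a run of digits starting at i and returns its length.
    auto digits = [&]() {
        const std::size_t start = i;
        while (i < n && std::isdigit(static_cast<unsigned char>(tok[i]))) ++i;
        return i - start;
    };

    if (i < n && (tok[i] == '+' || tok[i] == '-')) ++i;  // assumed: optional sign

    const std::size_t int_digits = digits();
    bool has_dot = false;
    std::size_t frac_digits = 0;
    if (i < n && tok[i] == '.') {
        has_dot = true;
        ++i;
        frac_digits = digits();
    }
    if (int_digits + frac_digits == 0) return TokenKind::Text;  // no mantissa digits
    if (i == n) return has_dot ? TokenKind::Float : TokenKind::Integer;

    // Exponent part: 'e' or 'E', optional sign, at least one digit, nothing after.
    if (tok[i] == 'e' || tok[i] == 'E') {
        ++i;
        if (i < n && (tok[i] == '+' || tok[i] == '-')) ++i;
        if (digits() > 0 && i == n) return TokenKind::Exponential;
    }
    return TokenKind::Text;
}
```

Under these assumptions, A.1 and 3ABC are classified as Text, 12.345 as Float, and 14e-3 as Exponential. If a mantissa without a decimal point should not be accepted, the exponent branch can additionally require has_dot.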

@catchmrbharath

@gridseth Do you need help in sending a pull request?

@gridseth
Author

@catchmrbharath Thanks. I am getting some help now, so I'll be able to submit soon.

@catchmrbharath

catchmrbharath commented Jun 23, 2016

  • Fork the repo
  • Add your fork as a remote in your local clone: git remote add develop "your_repo_url"
  • Push your branch to it: git push develop yourbranchname
  • Use the submit pull request button to send a pull request.

@gridseth
Author

I have added the pull request, thanks for the help! #31

@deads
Contributor

deads commented Jun 24, 2016

Thank you for the bug report and the proposed fix. I will look it over.

@deads
Contributor

deads commented Feb 20, 2017

This issue should be resolved given the latest merged PRs. If not, please reopen. Thank you for your input and contributions.

@deads closed this as completed Feb 20, 2017

3 participants