Possible issues with equal signs in locals #3

Closed
sergiocorreia opened this issue Jul 25, 2017 · 4 comments
@sergiocorreia

This mostly applies to older versions of Stata (e.g. Stata 12), but in general it's risky to have lines such as

local xyz = substr(..)

because the equal sign evaluates the right-hand side as an expression, which can truncate the local. A good explanation of why is here: http://www.stata-journal.com/sjpdf.html?articlenum=pr0045
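
To make the difference concrete, here is a minimal sketch contrasting the two forms. The macro names (`big`, `risky`, `safe`) and the test string are made up for illustration. It assumes the behavior described in the linked article: older releases cap the length of a string expression's result, with the exact limit depending on version and flavor. On recent versions both lengths should come out the same.

```stata
* Build a macro that is longer than the old string-expression limit.
* (The names big, risky and safe are hypothetical, for illustration only.)
local big ""
forvalues i = 1/100 {
    local big `big' word`i'
}

* Risky: the right-hand side is evaluated as a string expression,
* so older Stata versions may silently truncate the result.
local risky = subinstr("`big'", "word", "w", .)

* Safer: the extended macro function works on the macro directly,
* with no expression evaluation in between.
local safe : subinstr local big "word" "w", all

* Measure both with the extended function, again avoiding "=".
local n_risky : length local risky
local n_safe  : length local safe
display "risky: `n_risky' chars, safe: `n_safe' chars"
```

If the two reported lengths differ, the `=` form was truncated on that installation.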

@mcaceresb
Owner

mcaceresb commented Jul 26, 2017

I have no way of checking, since I don't have older Stata versions, but if I just use the extended macro subinstr, will that fix any potential issues? I would assume so, since the solution there is to use extended macro functions, but I can't actually check. I mean, on my machine subinstr handled a string with 10,000 characters just fine.

I generally prefer to have the right-hand side evaluated, so this is very much intentional, but I hadn't realized older versions might have an issue with it.

@sergiocorreia
Author

That's correct, replacing all cases of local ... = with their extended macro counterparts will fix the issue. That's something I had to do for reghdfe after a few bug reports:

sergiocorreia/reghdfe@7c87214

That said, my advice would be to leave it on the back burner. Much more useful would be MP support for Windows, OSX support, etc., as the overlap between Stata 12 users and gtools users is probably low.

@mcaceresb
Owner

Sure, but it's also low-hanging fruit. I am also working on multi-threaded support for Windows but it's proving a bit tricky. As for OSX, a port is on hold until I can get someone to compile on Apple hardware for me. I tried and failed to set up a virtual machine with OSX, so I don't have an ETA for that anymore.

@mcaceresb mcaceresb self-assigned this Jul 26, 2017
mcaceresb added a commit that referenced this issue Jul 27, 2017
Enhancements

* Addressed the possible issue noted in issue
  #3 and the functions now
  use mata and extended macro functions as applicable.

Bug fixes

* `gegen varname = tag(varlist)` no longer tags missing values, as noted
  in issue #5
mcaceresb added a commit that referenced this issue Jul 27, 2017
gtools-0.6.5 through gtools-0.6.9

Enhancements

* Addressed the possible issue noted in issue
  #3 and the functions now
  use mata and extended macro functions as applicable.
* `gegen varname = group(varlist)` no longer has holes, as noted in issue
  #4
* `gegen` and `gcollapse` fall back on `collapse` and `egen` in case there
  is a collision. Future releases will implement an internal way to resolve
  collisions. This is not a huge concern, as SpookyHash has no known
  vulnerabilities (I believe the concern raised in issue #2
  was based on a typo; see [here](rurban/smhasher#34))
  and the probability of a collision is very low.
* `gegen varname = group(varlist)` now has a consistency test (though
  the group IDs are not the same as `egen`'s, they should map to the `egen`
  group IDs 1 to 1, which is what the tests now check for).
* The function now checks numerical variables to see if they are integers.
  Working with integers is faster than hashing.
* The function is now smarter about generating targets. In prior versions,
  when the target statistic was a sum the function would force the target
  type to be `double`. Now if the source already exists and is a float, the
  function checks whether the resulting sum would overflow. It will only
  recast the source as double for collapsing if the sum might overflow, that
  is, if `_N * min < -10^38` or `10^38 < _N * max` (note +/- 10^38 are the
  largest/smallest floats Stata can represent; see `help data_types`).
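
The overflow bound in the last item can be sketched in plain Stata as follows. This is only an illustration of the inequality quoted above, not the plugin's actual code; the variable `x`, the scalar `may_overflow`, and the simulated data are assumptions made up for the example.

```stata
* Simulated float source variable (hypothetical data for illustration).
clear
set obs 1000
generate float x = runiform() * 1e6

* Worst case: every observation equals the minimum or the maximum,
* so the sum is bounded by _N * min and _N * max.
summarize x, meanonly
scalar may_overflow = (_N * r(min) < -10^38) | (10^38 < _N * r(max))

* Only recast to double when the bound says the float sum might overflow.
if (may_overflow) {
    display "sum of x might overflow a float; use double"
}
else {
    display "sum of x fits in a float"
}
```

The bound is conservative: it can flag a variable whose actual sum would fit, but it never misses one that would overflow.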

Bug fixes

* `gegen` no longer ignores unavailable options, as noted in issue
  #4, and now it throws an error.
* `gegen varname = tag(varlist)` no longer tags missing values, as noted
  in issue #5
* Additional fixes for issue #1
* Apparently the arguments Stata passes to plugins have a maximum length. The
  code now makes sure chunks are passed when the PATH length will exceed the
  maximum. The plugin later concatenates the chunks to set the PATH correctly.
* Fixed issue #1
* The problem was that the wrapper I wrote to print to the Stata
  console has a maximum buffer size; when it tries to print the
  new PATH it encounters an error when the string is longer than
  the allocated size. Since printing this is unnecessary and
  will only ever be used for debugging, I no longer print the PATH.
* Debugging issue #1
  on github (in particular, `env_set` on Windows).
* Removed old debugging code that had been left uncommented
* Improved out-of-memory message (now links to relevant help section).
@mcaceresb
Owner

This should be fixed in version 0.6.10- (master branch).
