Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

std.uni.byGrapheme & Grapheme.opSlice requires obscure REF parameters. #10402

Open
dlangBugzillaToGithub opened this issue Jan 6, 2020 · 2 comments
Labels
Arch:x86_64 Issues specific to x86_64 OS:Windows Issues Specific to Windows Severity:Enhancement

Comments

@dlangBugzillaToGithub
Copy link

robert.muench reported this on 2020-01-06T08:25:26Z

Transfered from https://issues.dlang.org/show_bug.cgi?id=20483

CC List

  • hsteoh
  • sascha.orlov

Description

There is the "What type does byGrapheme() return?" thread in d.learn from which I copy different statements and findings:

```
string r1 = "Robert M. Münch";
	// Code-Units  = 16
	// Code-Points = 15
	// Graphemes   = 15

Grapheme[] gr1 = r1.byGrapheme.array;
writeln(" Text = ", gr1.map!(g => g[]).joiner.to!string);
	//  Text = obert M. Münch
writeln("wText = ", gr1.map!(g => g[]).joiner.to!wstring);
	//  wText = obert M. Münch
writeln("dText = ", gr1.map!(g => g[]).joiner.to!dstring);
	//  dText = obert M. Münch
```

Looks like when you use .map over the Grapheme, it gets copied into a temporary, which gets invalidated when map.front returns.

compiling with dmd -dip1000 produces this error message:

	test.d(15): Error: returning g.opSlice() escapes a reference to parameter g, perhaps annotate with return
	/usr/src/d/phobos/std/algorithm/iteration.d(499):        instantiated from here: MapResult!(__lambda1, Grapheme[])
	test.d(15):        instantiated from here: map!(Grapheme[])

Not the most helpful message (the annotation has to go in Phobos code, not in user code), but it does at least point to the cause of the problem.

The original example is fixable by putting "ref" in for all the lambdas. But this is kind of disturbing. Why does the grapheme do this? The original data is not scoped.

e.g.:

```
writeln(" Text = ", gr1.map!((ref g) => g[]).joiner.to!string);
```

The fact that a Grapheme's return requires you keep the grapheme in scope for operations seems completely incorrect and dangerous IMO (note that operators are going to always have a ref this, even when called on an rvalue). So even though using ref works, I think the underlying issue here really is the lifetime problem.

Investigate whether Grapheme.opSlice can be implemented differently, such that we avoid this obscure referential behaviour that makes it hard to work with in complex
code.
@dlangBugzillaToGithub
Copy link
Author

sascha.orlov commented on 2020-01-06T11:27:01Z

for the sake of completeness
https://forum.dlang.org/thread/qu5et2$1f0h$1@digitalmars.com

@dlangBugzillaToGithub
Copy link
Author

hsteoh commented on 2020-01-08T20:05:44Z

The problem is caused by the implementation of Grapheme.opSlice(), which returns a proxy range object that contains a pointer to the parent Grapheme.  A cursory glance at the code didn't show any obvious reason why this was done this way; conceivably the proxy object returned by .opSlice could just copy the necessary data from Grapheme instead of retaining a pointer to it.

This implementation choice makes it hard to work with Graphemes in range-based pipelines. If anywhere in the pipeline a Grapheme is generated on-the-fly, it will generally be returned from .front as an rvalue, and therefore even the ref trick mentioned in this bug will no longer work.

I recommend to change the implementation of .opSlice so that it does not retain a pointer to the parent Grapheme.

@LightBender LightBender removed the P4 label Dec 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Arch:x86_64 Issues specific to x86_64 OS:Windows Issues Specific to Windows Severity:Enhancement
Projects
None yet
Development

No branches or pull requests

2 participants