Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate public aliases for Regex, StaticRegex, etc #5964

Closed
wants to merge 2 commits into from

Conversation

dhasenan
Copy link
Contributor

As per the discussion at #5963, we want the option to swap out the types for Regex, Capture, etc. This is impossible if people are able to refer to Regex by name. Furthermore, according to @wilzbach, aliases are not an option. Therefore, this PR deprecates public uses of regex types (Regex, StaticRegex, Captures, and RegexMatch) to ensure we can later swap out the implementation types..

As per the discussion at dlang#5963, we
want the option to swap out the types for Regex, Capture, etc. This
is impossible if people are able to refer to Regex by name.
Furthermore, according to @wilzbach, aliases are not an option.
Therefore, this PR deprecates public uses of the Regex types to allow
us to switch out the concrete type later.
@dlang-bot
Copy link
Contributor

Thanks for your pull request, @dhasenan! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.

Some tips to help speed things up:

  • smaller, focused PRs are easier to review than big ones

  • try not to mix up refactoring or style changes with bug fixes or feature enhancements

  • provide helpful commit messages explaining the rationale behind each change

Bear in mind that large or tricky changes may require multiple rounds of review and revision.

Please see CONTRIBUTING.md for more information.

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

@DmitryOlshansky
Copy link
Member

I think this quickly got out of hand. I think it's fine to keep Regex!Char and in fact I'd drop isRegexFor instead.

StaticRegex!X is indeed on its way out.

Captures!R is a non-standard concept (unlike e.g. forward range) that we are pretty much stuck with.

RegexMatch is indeed just a forward range of Captures so can deprecate that .

public alias StaticRegex = Regex;

private alias _Regex(Char) = std.regex.internal.ir.Regex!(Char);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw that's another advantage of using auto:
We don't have to pay any initialization cost until the function is called once :)

Copy link
Contributor Author

@dhasenan dhasenan Dec 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean import cost, no? Because "initializers" would be constructors, usually. You never pay for an instance constructor until an instance is created, and you always pay for a static constructor before main().

Anyway, that's easily addressed:

template _Regex(Char)
{
  private import std.regex.internal.ir : IRRegex = Regex;
  alias _Regex = IRRegex!Char;
}

Pretty sure this technique has been discussed on the newsgroup several times.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant CTFE "initialization" cost - evaluation would have been a better word.

Pretty sure this technique has been discussed on the newsgroup several times.

Well NG and actual PRs is a huge difference. Here's an overview on the status quo on making Phobos slicker:

tl;dr: there are still a lot of easy fruits to pick :)

@DmitryOlshansky
Copy link
Member

The biggest problem is that we should move docs to package.d so that Regex alias contains DDoc from internal/ir.d

@dhasenan
Copy link
Contributor Author

dhasenan commented Dec 25, 2017

I think this quickly got out of hand.

Thank you! I strive to be efficient.

I think it's fine to keep Regex!Char

But then we can't change regex() to return a value of a different type without a deprecation cycle. Since the consensus seems to be that this is an important thing for us to be able to do, we should get the necessary deprecation cycle started ASAP.

RegexMatch is indeed just a forward range of Captures so can deprecate that .

Not quite. It's a forward range of Captures with extra convenience methods pre, post, and hit. While we could add free functions to be used with UFCS to emulate this if we change to a more widely used type, this would break code that uses selective imports. Selective local import style is being pushed rather strongly.

So we can't switch from RegexMatch to something that's currently more widely used. But we could come up with a type similar to RegexMatch that can be more widely used (eg a more advanced type for searching through random-access ranges). If we did that, I would be able to add an alias for compatibility, but that apparently isn't a universal trait of phobos devs and I can't commit to being available to help out. So that's not an option.

@wilzbach
Copy link
Member

I strive to be efficient.

https://issues.dlang.org/buglist.cgi?component=phobos&list_id=218415&product=D&resolution=---

Any help would be highly appreciated!

Furthermore, according to @wilzbach, aliases are not an option.
...
I would be able to add an alias for compatibility, but that apparently isn't a universal trait of phobos devs

Oh I think you have misunderstood me a bit. I was trying to say that we can't use aliases to change the return type of a function if it's not auto, e.g.:

Bar myFancyPhobosFunction()
{
...
}

and in a later version do the following:

alias Bar = Bar2;

auto essentially means to a developer "hey the return type of this function might change, use typeof(..) if you really need to store my type".

@dhasenan
Copy link
Contributor Author

dhasenan commented Dec 25, 2017

I strive to be efficient.

https://issues.dlang.org/buglist.cgi?component=phobos&list_id=218415&product=D&resolution=---

Any help would be highly appreciated!

Ah, I see the problem. When I said I strive to be efficient, you thought I said that I strive to do whatever you think is important whenever you want, without compensation. I really have no clue how people keep confusing the two, but since they do it a lot, apparently it's an easy mistake to make.

I charge $150 per hour normally, but I offer a 30% open source discount.

I was trying to say that we can't use aliases to change the return type of a function if it's not auto

Some discussion of why this is the case would be salient. Removing one type in favor of another and adding a compatibility alias seems like the obvious way to make a change with a reasonable deprecation cycle.

Since it's apparently better to document the entire interface of your
return types than to use static typing to direct people to the type's
own documentation, this ensures that that documentation is complete.
@schveiguy
Copy link
Member

FYI, here is a PR that fixes just the immediate issue for RegexMatch.front: #5976

auto essentially means to a developer "hey the return type of this function might change, use typeof(..) if you really need to store my type".

This is not correct IMO. Only Voldemort returns make this guaranteed. And even there, it doesn't give you liberty to change the API of the return type.

For instance, if you have:

auto foo(T)(T t)
{
   static if(is(T == int)) return cast(long)t * 2;
   else return t * 2;
}

auto doesn't say we can change the type without notice. It says "I don't want to have to make a template IntOrLong just to write a return type for this"

Copy link
Member

@DmitryOlshansky DmitryOlshansky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are to go forward with this:

  • tune down the destruction
  • remove 3 documented comments
  • move whatever documentation there is from internal subpackage

Lastly - for contructual work, negotiate with D Language foundation if they can accomodate such.

@@ -557,57 +562,49 @@ public:
}
}
}
///Slice of input prior to the match.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an important part of API actually.

@property R pre()
{
return _nMatch == 0 ? _input[] : _input[0 .. matches[0].begin];
}

///Slice of input immediately after the match.
@property R post()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

assert(!matchFirst("nothing", "something"));
---
+/

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Super useful part of API

Number of pattern matched counting, where 1 - the first pattern.
Returns 0 on no match.
+/

@safe @property int whichPattern() const nothrow { return _nMatch; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remarkably important part of API that allows you to match many things at once

@safe @property int whichPattern() const nothrow { return _nMatch; }

///
@system unittest
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is plain sabotage - deleting code that shows how to use the above powerful primitive

@property bool empty() const { return _captures._nMatch == 0; }

///Same as !(x.empty), provided for its convenience in conditional statements.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Useful short-cut to be documented.

Effectively it's a forward range of Captures!R, produced
by lazily searching for matches in a given input.
+/
@trusted public struct RegexMatch(R)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically of this PR,
only this can be kept

@property R post()
{
return _captures.post;
}

///ditto
@property R hit()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 2-nd thing to keep from this PR

$(LI captures: compatibility alias for previous versions of std.regex.)
)

Each capture struct offers the following methods:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead just document the Captures struct. This description is at best a skeleton

T opCast(T:bool)(){ return !empty; }

/// Same as .front, provided for compatibility with original std.regex.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 3-rd thing to keep from this PR

@dhasenan
Copy link
Contributor Author

Instead just document the Captures struct.

We can't use real return types because we need to be able to change out the types returned without a deprecation cycle. But in other places where that's done, we either use a private type or a Voldemort type to make it actually work. Nobody can actually refer to MapResult, so we can switch it out trivially.

That's not the case with Captures, so we need to make it private as soon as possible, since that will require a deprecation cycle and we don't want to have to wait in order to change the return types here.

However, once Captures is private, it won't appear in ddoc. Therefore we need to document it elsewhere.

(Or we could just be honest about our inclination to change return types and stop using auto, but that idea was shot down.)

Anyway, I don't have the patience to deal with open source anymore, so you all can decide what you want. I'm out.

@dhasenan dhasenan closed this Jan 15, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants