Ruby / PostgreSQL plug-ins #22

Open
stevengj opened this issue Mar 6, 2015 · 17 comments

Comments

@stevengj
Member

stevengj commented Mar 6, 2015

These were in the original utf8proc, but I removed them in the libmojibake fork to focus on the C components. We can certainly add them back in easily (since the API is backward-compatible), and should distribute them in some form (bundled or separate?) in any case. My inclination is just to bundle them, but since I don't use those languages I'm worried about bitrot unless we can add a test suite for them.

Are Ubuntu/Debian and Fedora distributing these plugins in their utf8proc packages, or are they just distributing the C library?

@nalimilan
Member

ATM the Fedora package does not contain the plug-ins, because I was too busy with other Julia dependencies to care about them.

I'm not sure working on this is really worth it, I'd rather try to get the changes upstream. Of course that doesn't depend only on you...

@ivarne

ivarne commented Mar 6, 2015

@nalimilan I added a strikethrough to your comment in light of #21

@nalimilan
Member

Yeah, fine.

Regarding the topic of the issue, maybe we can ask for help from somebody familiar with Ruby/PostgreSQL on the Julia mailing lists? There must be people like that around.

@pao

pao commented Mar 6, 2015

Would there be any way to find current users of those extensions? I guess libutf8proc didn't have a mailing list.

The published Ruby gem is a release behind: https://rubygems.org/gems/utf8proc shows v1.1.5. According to https://rubygems.org/api/v1/gems/utf8proc/reverse_dependencies.yaml there are three gems that depend on it: rubyrdf, mack-localization, and trollface. Neither of the first two has been updated since 2009, and the third has been "yanked" from rubygems.org.

@StefanKarpinski
Member

I'm inclined to leave that stuff out.

@nalimilan
Member

Well, if we reclaim the name utf8proc, removing existing features would not be very nice to existing users. If all that's needed is to bring back the old code, I'd vote for keeping them.

@stevengj
Member Author

stevengj commented Mar 6, 2015

@nalimilan, I tend to agree at first glance, with the caveat that I don't want to include the code unless we include a test script to validate it.

@stevengj
Member Author

stevengj commented Mar 6, 2015

Maybe Ruby has built-in facilities to normalize Unicode strings etc. nowadays? Or is there some other gem that has become the de-facto standard?

e.g. this Ruby blog post mentions several alternatives, but not utf8proc, which makes me think that it's not popular with Ruby users. (As mentioned in #12, it would be good to benchmark utf8proc against unf, which seems to be the fastest library in that Ruby comparison.)

Of course, maybe Ruby people should be using utf8proc, and would be using it if we were more aggressive about maintaining and promoting it. But someone who actually uses Ruby should be the one to take the lead on that.
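
For reference on the built-in question above, here is a minimal sketch, assuming Ruby 2.2 or later, where String#unicode_normalize is available in core. The helper name normalization_forms is hypothetical and only for illustration; the sketch says nothing about how the built-in compares with utf8proc or unf in speed or Unicode version coverage.

```ruby
# Assumption: Ruby 2.2+ (String#unicode_normalize is built in there).
# normalization_forms is a hypothetical helper, not part of any library.

# Return the four standard Unicode normalization forms of +text+ as a Hash.
def normalization_forms(text)
  [:nfc, :nfd, :nfkc, :nfkd].each_with_object({}) do |form, result|
    result[form] = text.unicode_normalize(form)
  end
end

# "e" followed by U+0301 COMBINING ACUTE ACCENT (a decomposed e-acute).
forms = normalization_forms("e\u0301")

# Print the code points of each form so the differences are visible.
forms.each do |form, str|
  points = str.codepoints.map { |c| format("U+%04X", c) }.join(" ")
  puts "#{form}: #{points}"
end
```

The same helper could be pointed at the Unicode NormalizationTest.txt data to compare the built-in against a utf8proc wrapper, if someone who uses Ruby wants to take that on.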

@stevengj
Member Author

stevengj commented Mar 6, 2015

(Also, I'm not really familiar with Ruby gems, but wouldn't the gem be the right place to maintain the Ruby plugin, rather than bundling it with utf8proc itself?)

@StefanKarpinski
Member

I agree, this seems like it just doesn't belong in utf8proc.

@tkelman
Contributor

tkelman commented Mar 6, 2015

leaving this link to the commit where these were deleted for future reference: c0f2b51

@PallHaraldsson

I know about PostgreSQL (not really Ruby) but do not use its plugins/extensions. [I know they made changes in the recent past.]

I found a fork:

https://github.com/vlajos/libmojibake/blob/master/NEWS.md

But I haven't found out how PostgreSQL manages.

Given the major version update, I'm not sure: should this just be closed?

@tkelman
Contributor

tkelman commented Jul 14, 2016

There's also a fork apparently being used by the netsurf browser (ref http://source.netsurf-browser.org/libutf8proc.git/). We should try to reach out to them and see if we can collaborate on maintenance and merge their patches. Evidently Arch and Gentoo have been using them as "upstream", which is a little odd.

@tkelman
Contributor

tkelman commented Jul 14, 2016

cc @kyllikki on the above, I think you're the one who authored most or all of those patches? We'd be happy to review pull requests for any features or bug fixes you need.

@nomoon

nomoon commented Feb 22, 2017

I've made a small Ruby gem that wraps utf8proc @ https://github.com/nomoon/utf8_proc. It appears to be a bit slower (0.5-2x) than the ubiquitous unf gem in most cases, but passes all the Unicode 9.0 tests while unf hasn't been touched since 6.0.

@stevengj
Member Author

@nomoon, that sounds great, thanks for working on this. I will add a link to the utf8proc web page.

stevengj added a commit that referenced this issue Feb 22, 2017
@nomoon

nomoon commented Feb 22, 2017

@stevengj Cheers! I've just done some tweaking and am pretty certain it should work fine on any Ruby 2.0+ system that can locate utf8proc's headers and lib. I'm on OSX where that's as easy as brew install utf8proc. I'll see about whether it's feasible to get some Travis testing working with it soon.
