Use a Git mirror of CLDR to get rid of Subversion dependency #60

Closed
foolip opened this issue Jan 29, 2016 · 13 comments

Comments

@foolip
Member

foolip commented Jan 29, 2016

I've set up https://github.com/foolip/cldr-data and a cron job using https://github.com/foolip/cldr-data-updater.

Edit: These repos have been removed, let me know if you want them back for some reason.

For me, checking out using svn takes 55 seconds, while a --depth 1 clone using git takes 32 seconds. Not amazing, but depending on how we do it, the incremental updating should be faster.
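A rough way to reproduce this kind of comparison is to time both checkouts from a script. The sketch below is only an illustration, not the commands actually used for the numbers above: `shallow_clone_argv`, `svn_checkout_argv` and `time_command` are hypothetical helper names, and any URL or directory passed to them is a placeholder to fill in.

```python
import subprocess
import time


def shallow_clone_argv(url, dest):
    """Build the argv for a Git clone that fetches only the latest commit."""
    return ["git", "clone", "--depth", "1", url, dest]


def svn_checkout_argv(url, dest):
    """Build the argv for a plain Subversion checkout."""
    return ["svn", "checkout", url, dest]


def time_command(argv):
    """Run argv to completion and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    # check=True raises CalledProcessError if the command fails.
    subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start
```

Calling `time_command(shallow_clone_argv(url, "cldr-git"))` and `time_command(svn_checkout_argv(url, "cldr-svn"))` with the respective repository URLs gives two numbers to compare; results depend heavily on network conditions.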

Does everyone hate git submodules? I think it'd be kind of nice to get all dependencies into Git and explicitly update dependencies, even if it's done by a roll bot.

@foolip
Member Author

foolip commented Jan 29, 2016

See #61 for one crappy way to do it, but let's talk about it here first.

@annevk
Member

annevk commented Jan 29, 2016

They are always a bit of a pain for web-platform-tests. I wonder how often the output actually changes. Perhaps it's more reasonable that we do this in a different way.

@foolip
Member Author

foolip commented Jan 29, 2016

You mean submodules are annoying for web-platform-tests? I agree, but mostly because they're nested and I never remember the flags to recursively update.

@domenic
Member

domenic commented Jan 30, 2016

I really do hate my experiences with submodules so far, although I do admit they've always been nested. I kind of feel like it would suck for contributors to html too, who have to clone html-build and then worry about submodule stuff. IMO the current setup is not the best, but is not too bad either. #61 seems OK, but I am not sure saving 30 seconds for initial clone is that important. Getting rid of the svn dependency is kind of a big deal though, so maybe it is worthwhile.

BTW doesn't GitHub have some ability to automatically mirror SVN repos? Googling around I cannot find anything, so maybe the instances I am recalling were specifically set up by GitHub itself. Setting one up for CLDR seems like a reasonable ask of GitHub and they might be willing to accommodate us instead of putting the burden on you and your scripts.

@sideshowbarker
Contributor

> I wonder how often the output actually changes.

Roughly weekly, based on looking through https://github.com/foolip/cldr-data/commits/master. Sometimes (like recently) more often than weekly, but sometimes less often.

> Perhaps it's more reasonable that we do this in a different way.

Yeah, even though it is changing weekly—that is, relatively frequently—I still think we’d be helping contributors more by not requiring them to rebuild it every time, or at all. I think instead we should just rebuild the cldr.inc file ourselves from the sources, and check it into the repo. Same for the entities-dtd.url file (which is the ultimate build product from the unicode.xml file, I think).

@annevk
Member

annevk commented Jan 30, 2016

Even with that repository changing weekly, how often does the HTML standard change as a result? Perhaps there is a way to define this imperatively in the HTML standard on top of the CLDR data. I would not expect implementers to change their implementation every week or copy-and-paste the output from what this generates (which I think is only :lang stuff).

@foolip
Member Author

foolip commented Jan 30, 2016

Is there some overlap between this data and unicode.xml? I suspect that most software projects actually use ICU for this kind of stuff.

At any rate, I think we could improve our dependency handling, requiring fewer installed tools (like svn) to build and most likely also speeding up builds when nothing has changed, making the -n flag less necessary in day-to-day development.

Maybe a single html-build-deps repository that contains everything needed, and which is updated by cron jobs or similar? Then, use shallow clones or the "Download ZIP" feature so that people don't have to pull in the full history of that repo. The only way to make local builds require no network access would be to track html-build-deps by commit, which would mean using submodules or a DEPS file, which would be about the same thing.
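To make "track html-build-deps by commit" concrete, here is a minimal sketch of what a DEPS-style pin could look like. Everything in it is an assumption for illustration: the one-line-per-dependency format (`<name> <url> <commit>`) is invented, `parse_deps` and `fetch_commands` are hypothetical helpers, and fetching a single commit by hash only works when the server permits it.

```python
def parse_deps(text):
    """Parse lines of the form '<name> <url> <commit>' into a dict.

    Blank lines and lines starting with '#' are ignored. This format is
    made up for illustration; it is not an existing html-build file.
    """
    deps = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, url, commit = line.split()
        deps[name] = (url, commit)
    return deps


def fetch_commands(name, url, commit):
    """Return git commands that shallowly fetch exactly one pinned commit."""
    return [
        ["git", "init", name],
        ["git", "-C", name, "remote", "add", "origin", url],
        # Depth 1 avoids pulling in the full history of the deps repo.
        ["git", "-C", name, "fetch", "--depth", "1", "origin", commit],
        ["git", "-C", name, "checkout", "--detach", "FETCH_HEAD"],
    ]
```

Once the pinned commit has been fetched, later builds need no network access until someone (or a roll bot) changes the commit in the file, which is essentially what a submodule records as well.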

@foolip
Member Author

foolip commented Jan 30, 2016

One reason I started to look at this is because I keep tripping over "Build tools have been updated since last run; clearing the cache" and fetching everything is a bit slow here from Hanoi. Git knows how to compare the stuff you have to the stuff you need and avoid re-transferring everything :)

@foolip
Member Author

foolip commented Jan 30, 2016

And if only the final cldr.inc file is in html-build-deps, then that would also remove the dependency on the XML::Parser Perl module. That and svn are what I keep having to install on machines where I try to build HTML.

@foolip
Member Author

foolip commented Jan 30, 2016

And unzip...
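As an aside, a build could report these missing tools up front rather than failing partway through. The following is a hedged sketch, not something html-build does as far as this thread says: `missing_executables` and `has_perl_module` are hypothetical helpers, and the Perl check relies on `perl -M<module> -e 1` exiting non-zero when the module cannot be loaded.

```python
import shutil
import subprocess


def missing_executables(names, which=shutil.which):
    """Return the subset of names that cannot be found on PATH."""
    return [name for name in names if which(name) is None]


def has_perl_module(module, run=subprocess.run):
    """Return True if `perl -M<module> -e 1` exits successfully."""
    try:
        result = run(
            ["perl", "-M" + module, "-e", "1"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except FileNotFoundError:
        # perl itself is not installed.
        return False
    return result.returncode == 0
```

For the tools named in this thread, that would be `missing_executables(["svn", "unzip"])` and `has_perl_module("XML::Parser")`.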

@domenic
Member

domenic commented Jan 30, 2016

I think I'd like the built intermediate products (cldr.inc?) to be included in the whatwg/html repo, with a bot that sends us pull requests to update them when they change (or commits directly). As @annevk points out, even if CLDR changes frequently, I am not sure the relevant parts of the spec do.
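The "only when they change" part could be as simple as comparing the freshly built file against the committed copy before the bot does anything. A minimal sketch, assuming the bot has already rebuilt the file into memory; `digest` and `needs_update` are hypothetical names, and opening the pull request itself is left out.

```python
import hashlib
from pathlib import Path


def digest(data):
    """Return the SHA-256 hex digest of a bytes object."""
    return hashlib.sha256(data).hexdigest()


def needs_update(committed_path, rebuilt_bytes):
    """Return True if the rebuilt output differs from the committed file.

    A missing committed file also counts as needing an update.
    """
    path = Path(committed_path)
    if not path.exists():
        return True
    return digest(path.read_bytes()) != digest(rebuilt_bytes)
```

With a check like this, frequent upstream CLDR commits that do not affect the generated output would produce no pull request at all.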

@foolip
Member Author

foolip commented Jan 30, 2016

Just dumped some ideas in #62, let's discuss how to deal with it all.

@foolip
Member Author

foolip commented Feb 2, 2016

Per #62 (comment) I think we'll want to get rid of this dependency at build-time, or entirely, so no need to switch to a Git mirror until then.

foolip closed this as completed Feb 2, 2016