Add /libraries #16

Closed
bebraw opened this issue Feb 26, 2014 · 49 comments

@bebraw
Contributor

bebraw commented Feb 26, 2014

It would be potentially interesting to add a library oriented resource. Example:

GET /v2/libraries?normname=jquery/jquery (need to use a normalized name here to avoid name clashes (just see all color libs))

[
    {
        "normname": "jquery/jquery",
        "_id": "ey8mGL8Bp7q",
        "created": "2014-02-25T08:48:09.270Z",
        "description": "jQuery is a fast and concise JavaScript Library that simplifies HTML document traversing, event handling, animating, and Ajax interactions for rapid web development. jQuery is designed to change the way that you write JavaScript.",
        "homepage": "http://jquery.com/",
        "github": "https://github.com/jquery/jquery",
        "author": "jQuery Foundation",
        "cdns": {
            "jsdelivr": {
                "name": "jquery",
                "mainfile": "jquery.min.js",
                "zip": "jquery.zip",
                "lastversion": "2.1.0",
                "versions": [
                    "2.1.0",
                    "2.0.3",
                    "2.0.2",
                    "2.0.1",
                    "2.0.0",
                    "1.11.0",
                    "1.10.2",
                    "1.10.1",
                    "1.10.0",
                    "1.9.1",
                    "1.9.0",
                    "1.8.3",
                    "1.7",
                    "1.4.4"
                ],
                "assets": [
                    {
                        "version": "2.1.0",
                        "files": [
                            "jquery.js",
                            "jquery.min.js",
                            "jquery.min.map"
                        ]
                    },
                    ...
                ]
            },
            "cdnjs": {...},
            "google": {...}
        }
    }
]

This simply aggregates the current information in a different format. At the top you can see the common metadata (name, description, etc.). CDN-specific information is stored under cdns.

A possible additional benefit of the scheme is that it would let us add libraries that are not yet available on any CDN to the index.
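As a rough illustration of the aggregation described above, here is a minimal sketch that folds per-CDN library lists into the library-oriented format. The input shape (one array of records per CDN), the `aggregate` and `normname` helper names, and deriving the normalized name from the `github` URL are all assumptions made for this example, not something the API is known to do.

```javascript
// Derive a normalized "owner/repo" name from the GitHub URL (assumption),
// falling back to the lowercased CDN name when no URL is available.
function normname(lib) {
  const match = /github\.com\/([^\/]+\/[^\/]+?)(?:\.git)?\/?$/.exec(lib.github || "");
  return match ? match[1].toLowerCase() : lib.name.toLowerCase();
}

// perCdn is hypothetical: { jsdelivr: [lib, ...], cdnjs: [lib, ...], ... }
function aggregate(perCdn) {
  const byName = new Map();
  for (const [cdn, libraries] of Object.entries(perCdn)) {
    for (const lib of libraries) {
      const key = normname(lib);
      if (!byName.has(key)) {
        // Common metadata comes from the first CDN that lists the library.
        byName.set(key, {
          normname: key,
          description: lib.description,
          homepage: lib.homepage,
          github: lib.github,
          author: lib.author,
          cdns: {},
        });
      }
      // CDN-specific data stays under cdns[<cdn name>].
      byName.get(key).cdns[cdn] = {
        name: lib.name,
        mainfile: lib.mainfile,
        lastversion: lib.lastversion,
        versions: lib.versions,
        assets: lib.assets,
      };
    }
  }
  return [...byName.values()];
}
```

A library that appears on only one CDN simply ends up with a single key under `cdns`.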

@shahata

shahata commented Feb 26, 2014

Not sure I like this format. It creates some confusion about what data is 'CDN agnostic'. For example, shouldn't 'mainfile' be inside each CDN? Also, I think that everyone who uses these APIs will probably need to do some level of processing, so maybe it is better if the format for this API is identical to the format of specific CDN queries. Just my two cents.

@bebraw
Contributor Author

bebraw commented Feb 26, 2014

@shahata Good point about mainfile. CDN agnostic data == package.json/bower.json? This removes some redundancy and provides metadata for CDNs that are missing it entirely (i.e. BootstrapCDN, jQuery CDN).

If I return just the matching CDNs, it will remove most of the search and filtering power from the API. Technically what you are proposing would be simpler, though. Let's see if we get more opinions on this. :)

I'm adding your scheme here so it's easier for others to contribute without having to check #9.

http://api.jsdelivr.com/v1/jsdelivr/libraries --> all libraries have {cdn: 'jsdelivr'}
http://api.jsdelivr.com/v1/google/libraries --> all libraries have {cdn: 'google'}
http://api.jsdelivr.com/v1/cdnjs/libraries --> all libraries have {cdn: 'cdnjs'}
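
To make the `{cdn: '...'}` tagging concrete, here is a small sketch of how per-CDN results could carry their source so that merged lists remain filterable. The function names and the input shape are hypothetical.

```javascript
// Attach the CDN name to every library record from that CDN.
function tagWithCdn(cdn, libraries) {
  return libraries.map((lib) => ({ ...lib, cdn }));
}

// perCdn is a hypothetical { cdnName: [library, ...] } object.
function mergeTagged(perCdn) {
  return Object.entries(perCdn).flatMap(([cdn, libs]) => tagWithCdn(cdn, libs));
}
```

With the tag in place, a consumer holding a merged list can still tell which CDN each record came from, at the cost of repeating the same value on every record.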

@tomByrer
Contributor

http://api.jsdelivr.com/v1/jsdelivr/libraries --> all libraries have {cdn: 'jsdelivr'}
http://api.jsdelivr.com/v1/google/libraries --> all libraries have {cdn: 'google'}
http://api.jsdelivr.com/v1/cdnjs/libraries --> all libraries have {cdn: 'cdnjs'}

I am not too excited by this. Perhaps it is my time spent using spreadsheet pivot tables, where you don't re-list redundant data, to reduce noise. But maybe I'm missing a use-case; why do you NEED the extra data (memory & bandwidth) if you already specified the dataset you're querying, please?

Otherwise, looks cool to me, cheers!

@tomByrer
Contributor

maybe I'm missing a use-case

Could make it easier for v2 queries, but harder for v1 queries upgrading to v2. Maybe allow including the 'cdns' via fields?
So http://api.jsdelivr.com/v2/jsdelivr/libraries/jquery will have the same output as v2, but http://api.jsdelivr.com/v2/jsdelivr/libraries?name=jquery&fields=mainfile,name,cdns,lastversion,assets would output

[
  {
    "mainfile": "jquery.min.js",
    "name": "jquery",
    "cdns": {
      "jsdelivr": {
        "2.1.0": [
          "jquery.js",
          "jquery.min.js",
          "jquery.min.map"
        ]
      }
    }
  }
]
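
A minimal sketch of how a `fields` query parameter like the one above could be applied to a record. The helper names are hypothetical, and only top-level fields are handled; how nested fields such as per-CDN assets would be selected is left open, as in the comment itself.

```javascript
// Parse a `fields` query value such as "mainfile,name,cdns".
function parseFields(value) {
  return value.split(",").map((f) => f.trim()).filter(Boolean);
}

// Keep only the requested top-level fields that exist on the record.
function pickFields(library, fields) {
  const out = {};
  for (const field of fields) {
    if (Object.prototype.hasOwnProperty.call(library, field)) {
      out[field] = library[field];
    }
  }
  return out;
}
```

Unknown field names are silently skipped here; rejecting them with an error would be an equally reasonable design.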

@bebraw
Contributor Author

bebraw commented Mar 17, 2014

I would rather let /libraries in v2 return something really simple. What if the cdn field were just an array of the names of the CDNs that provide the library? Then, if you are interested in that data, you can perform another lookup against cdnname/libraries to get the assets related to that CDN.

@tomByrer
Contributor

For some lookups, a simple CDN array would be great, but for others it would be a total of 2-5 (assuming other CDNs are added later) API requests, time to wait, merging of arrays, etc.
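
To show where the extra requests come from, here is a sketch that lists the follow-up lookups a client would need under the simple CDN-array proposal. The record shape `{ name, cdns: [...] }` and the v2 base URL are assumptions; the path layout just mirrors the v1 URLs quoted earlier in this thread.

```javascript
// One follow-up request per CDN that provides the library.
function followUpUrls(library, base = "http://api.jsdelivr.com/v2") {
  return library.cdns.map(
    (cdn) => `${base}/${cdn}/libraries?name=${encodeURIComponent(library.name)}`
  );
}
```

For a library hosted on three CDNs this yields three URLs, so the client makes four requests in total and then has to merge the results itself.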

@bebraw
Contributor Author

bebraw commented Mar 17, 2014

For some lookups, a simple CDN array would be great, but for others it would be a total of 2-5 (assuming other CDNs are added later) API requests, time to wait, merging of arrays, etc.

That's true. I guess it makes sense to return CDN info too.

@tomByrer
Contributor

tomByrer commented Apr 2, 2014

I found a solid use-case: https://duck.co/duckduckhack/ddh-intro
So using the API, I can make a widget so when someone searches for CDN {libraryname}, the DuckDuckGo search reply will come back with the results of CDNs & files available on top of the search.

@jimaek
Member

jimaek commented Apr 2, 2014

It would be super cool to become a source for DuckDuckGo.
@bebraw Do you think we can do that?

@bebraw
Contributor Author

bebraw commented Apr 2, 2014

@jimaek Why not? Maybe try contacting DuckDuckGo? Possibly they can provide some further ideas etc.

@jimaek
Member

jimaek commented Apr 2, 2014

I don't see a lot of activity here https://duck.co/ideas
So I guess we could either send an email or just develop it and submit it.

@tomByrer
Contributor

tomByrer commented Apr 2, 2014

I was planning on doing it. Just waiting on v2 to stabilize a bit more.

@jimaek
Member

jimaek commented Apr 2, 2014

@tomByrer Awesome, do you want me to create a repo for you in the jsDelivr org? Or would you prefer doing it as a personal project?
Let me know if I can help with anything.

@bebraw
Contributor Author

bebraw commented Apr 2, 2014

@tomByrer Any specific issues to resolve in mind? I have some work to do right now but I guess some sort of prototype would be in order.

@tomByrer
Contributor

tomByrer commented Apr 2, 2014

Thanks for asking guys!
DDG has their own repo for consuming APIs. Hmmm, I would think a repo at github.com/jsdelivr/duckduckgo would be best; that way you guys can be admins in case I fall off the face of the earth? Or should I just use my own namespace? I never had rights in a separate repo; don't want to mess with extra logins right now; trying to keep things simple.

I have 2 ongoing projects (one of which is to find a paid gig/job), so 1st phase is brain-storming & asking DuckDuckGo questions. Then I'll juggle progress between projects.

Since this is a real-life use-case, this project can also help hone the v2 API. The unified CDN search can be especially helpful; that's why I'm posting here.

There is also "islands" for Yandex, which is similar but may have a specialized interactive form to hone searches. I think I'll do that after DuckDuckGo, since Yandex's can be more complex, but I'd like to keep my brainstorming & progress public...

TIA

@jimaek
Member

jimaek commented Apr 2, 2014

I never had rights in a separate repo; don't want to mess with extra logins right now;

There are no extra logins. I will just give you admin rights on that repo and you will be able to do whatever you want there. Same account, same login. Interested?

Regarding extra API features, you can discuss those with @bebraw.

Let us know if you need anything.

@tomByrer
Contributor

tomByrer commented Apr 2, 2014

Thanks, a /jsdelivr/duckduckgo repo would help to get more involved.

Let me see if DDG caches API requests, then specific needs can be figured out from there.

@jimaek
Member

jimaek commented Apr 2, 2014

@tomByrer Done. You are now a member of the jsDelivr organization and admin of https://github.com/jsdelivr/duckduckgo

@tomByrer
Contributor

tomByrer commented Apr 2, 2014

Thanks! Party time 🍰

@tomByrer
Contributor

tomByrer commented May 5, 2014

@bebraw how is this going? I was looking into merging the CDNs myself, but I saw it would not be an easy pivot.

@bebraw
Contributor Author

bebraw commented May 5, 2014

I haven't touched it as I have been busy with paying work. When do you need it?

@tomByrer
Contributor

tomByrer commented May 5, 2014

Thanks for the reply. I'm somewhat busy also; I'll play with other things &/or figure out a way to merge the datasets until you have the time.

Do you have a particular dev stack I need to use to run this locally, please? Or is package.json good to go? (Not sure if something important is .gitignore'd.)

@bebraw
Contributor Author

bebraw commented May 5, 2014

  1. npm install
  2. ./serve.js

and you are good to go. It will take a while to build the in-memory database for each target so you might get blank results for a while.

If you need the aggregate fast, maybe it makes most sense just to build another bit in front of the API? Then it becomes just a simple mapping problem and you don't need to care about the insides of the API.

@tomByrer
Contributor

tomByrer commented May 6, 2014

Maybe open a separate issue for that?

I could do that, or

  • Google file listing
  • Google file versions
  • Google GitHub
  • JQuery GitHub
    ...

Do you want all in here, 1 issue per CDN, or 1 issue per item (15+)?

Google would be missing some of those files

Perhaps, actually one of my use-cases: help people & scripts find a particular version regardless of what CDN it is on.
I'm not sure if jquery-2.0.3.js is an official filename.
& I'm not sure if Google has all *.min.js *.min.map files. That would require another script/time to test.
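
The "find a particular version regardless of CDN" use-case could be sketched roughly as follows. This assumes the aggregated shape from the first comment, where `library.cdns[<cdn>].assets` is a list of `{ version, files }` entries; the function name is hypothetical.

```javascript
// Return every CDN that hosts the given version, with the files it serves.
function findVersion(library, version) {
  const hits = [];
  for (const [cdn, info] of Object.entries(library.cdns)) {
    const asset = (info.assets || []).find((a) => a.version === version);
    if (asset) hits.push({ cdn, files: asset.files });
  }
  return hits;
}
```

CDNs that lack the version, or that publish no asset list at all, are simply absent from the result.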

@bebraw
Contributor Author

bebraw commented May 6, 2014

@tomByrer Ok. How about I generate the Google assets for you and we continue from there?

@tomByrer
Contributor

tomByrer commented May 6, 2014

sure, thanks. I wasn't expecting anything from you aside from planning for a while; take your time please.

@bebraw
Contributor Author

bebraw commented May 6, 2014

@tomByrer Done. Demo: http://api.jsdelivr.com/v1/google/libraries?name=AngularJS

Note that there's a slight irregularity in the paths on the Google side for Dojo (dojo/1.9.3/dojo/dojo.js). The others seem regular.

@tomByrer
Contributor

tomByrer commented May 6, 2014

Cool ty looks nice so far.
Oh, I wasn't expecting a v1 for this, but that's up to you!
Though I hope there will be a per-version mainfile in v2 ;)

@bebraw
Contributor Author

bebraw commented May 6, 2014

Though I hope there will be a per-version mainfile in v2 ;)

Note that this will be possible only for jsdelivr. In the other cases it would always be the same, as they don't provide this sort of info.

@tomByrer
Contributor

tomByrer commented May 6, 2014

jsDelivr mainfile = CDNJS filename :)

@bebraw
Contributor Author

bebraw commented May 6, 2014

jsDelivr mainfile = CDNJS filename :)

Yes. That's how I'm mapping it right now.

What I'm saying is that in their case it would always point at the same mainfile regardless of the version.

I'm guessing we can do version specific mainfile only for jsdelivr. Doing the same for others seems a little redundant since it's always the same. In jsdelivr's case the top level mainfile should probably point at the mainfile of the latest available version.
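
A minimal sketch of "the top level mainfile points at the mainfile of the latest available version". It assumes a v2 asset shape where each entry carries its own `mainfile`, which is not confirmed anywhere in this thread, and it compares plain dotted numeric versions only.

```javascript
// Compare dotted numeric versions such as "1.7" and "1.10.2".
// Pre-release tags ("2.0.0-beta") are not handled in this sketch.
function compareVersions(a, b) {
  const pa = a.split(".").map(Number);
  const pb = b.split(".").map(Number);
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    const diff = (pa[i] || 0) - (pb[i] || 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

// assets: [{ version, mainfile, files }, ...] (assumed shape)
function topLevelMainfile(assets) {
  if (assets.length === 0) return undefined;
  const latest = assets.reduce((best, cur) =>
    compareVersions(cur.version, best.version) > 0 ? cur : best
  );
  return latest.mainfile;
}
```

Comparing numerically matters here: a plain string sort would rank "1.9.1" above "1.10.2".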

@tomByrer
Contributor

tomByrer commented May 6, 2014

it would always point at the same mainfile regardless of the version.

I'm thinking jsDelivr's "mainfile" would supersede the other CDNs' if both have the same filename/version. I'm indifferent to how you do it for CDN-specific queries, but that's how I plan to consume it.

I pinged CDNJS about the mainfile issue.

jsdelivr's case the top level mainfile should probably point at the mainfile of the latest available version

Yes, I assumed that would be best also, assuming you keep historical data as new versions are added?

@bebraw
Contributor Author

bebraw commented May 6, 2014

Yes, I assumed that would be best also, assuming you keep historical data as new versions are added?

The historical data doesn't matter. Latest is latest with or without history. :)

@tomByrer
Contributor

tomByrer commented May 6, 2014

Sorry I didn't explain well. I mean you store what the info.ini's mainfile currently is as part of the 'version:' array. If it changes later, then the prior versions' mainfile remains, & the current & future versions use the newly updated mainfile. I hope I explained it better.
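
The idea of remembering each version's mainfile could look roughly like this. The `history` map and the function name are hypothetical; the sketch only shows that versions seen on an earlier sync keep the mainfile recorded at that time.

```javascript
// history: { version: mainfile } as recorded on previous syncs.
// versions: all versions currently present.
// currentMainfile: the mainfile info.ini names right now.
function recordMainfiles(history, versions, currentMainfile) {
  const next = { ...history };
  for (const version of versions) {
    // Only versions without a recorded mainfile pick up the current one.
    if (!(version in next)) next[version] = currentMainfile;
  }
  return next;
}
```

This only works if the history survives between syncs, which is the point of contention in the replies that follow.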

@bebraw
Contributor Author

bebraw commented May 6, 2014

Sorry I didn't explain well. I mean you store what the info.ini's mainfile currently is as part of the 'version:' array. If it changes later, then the prior versions' mainfile remains, & the current & future versions use the newly updated mainfile. I hope I explained it better.

In jsdelivr's case the sync process takes a look at the directory structure and reconstructs the information every time it syncs. So in case changes are made to any info.ini it will pick them up.

@tomByrer
Contributor

tomByrer commented May 6, 2014

reconstructs the information every time it syncs

Yea I was afraid a total database rebuild happened every time you ran your API script. So really we will [eventually?] need that mainfile info inside every directory.
I hope that build script runs well when jsDelivr hits 10k repos, with an average of 30 releases in each one.

@bebraw
Contributor Author

bebraw commented May 6, 2014

Yea I was afraid a total database rebuild happened every time you ran your API script. So really we will [eventually?] need that mainfile info inside every directory.

If execution time or something becomes a problem, it's possible to construct the db only once in a while and rely on GitHub pubsubhubbub for changes. That will pretty much resolve the issue.
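
A rough sketch of the "build once, then apply changes" approach. The change event shape (`{ library, version, removed }`) is invented for this example; a real GitHub notification would first have to be translated into something like it.

```javascript
// db maps library name -> { versions: [...] } and is built once up front.
// Each incoming change is then applied without rebuilding everything.
function applyChange(db, change) {
  const entry = db[change.library] || { versions: [] };
  let versions = entry.versions;
  if (change.removed) {
    versions = versions.filter((v) => v !== change.version);
  } else if (!versions.includes(change.version)) {
    versions = [...versions, change.version];
  }
  // Return a new object so readers never see a half-applied update.
  return { ...db, [change.library]: { ...entry, versions } };
}
```

An occasional full rebuild would still be useful as a safety net in case a notification is missed.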

@fulldecent

Related: #48 Explicit documentation of asset URLs

@tomByrer
Contributor

tomByrer commented Sep 4, 2014

As an FYI, one way I was planning to demo v2 is to use a JSON DB PaaS like https://www.firebase.com/ where we can mock how the JSON should be served first, then actually program it. Firebase is really handy for quick demos, & it's easy to climb the JSON tree via the web interface.

@tomByrer
Contributor

Further v2 plans if I get to it before others:

  1. Rewrite fully or partially api-sync so output conforms to a single standard format. Likely will solve a few issues.
  2. If this API needs editing more than a few LOC, I'll likely restart using StrongLoop.

@megawac megawac self-assigned this Jun 1, 2015
@jimaek
Member

jimaek commented Jun 1, 2015

Due to the lack of bulletproof normalized names, I think we can close this issue, as it's very hard to do.

@jimaek jimaek closed this as completed Jun 1, 2015
@megawac
Contributor

megawac commented Jun 1, 2015

Do you think this cannot be done well enough through some hybrid matching:

jquery.x ~~ jquery-x ~~ jquery+x
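
One simple way to sketch this kind of matching is to collapse the common separators into a single form before comparing names. The function name and the exact set of separators are assumptions, and this alone would not resolve projects whose names differ more substantially between CDNs.

```javascript
// Treat ".", "-", "+", "_" and whitespace as the same separator.
function normalizeName(name) {
  return name
    .toLowerCase()
    .replace(/[.\-+_\s]+/g, "-")
    .replace(/^-+|-+$/g, ""); // drop leading/trailing separators
}

// Two names match when their normalized forms are identical.
function sameLibrary(a, b) {
  return normalizeName(a) === normalizeName(b);
}
```

Under this rule `jquery.x`, `jquery-x` and `jquery+x` all compare equal, while `jquery` and `jquery-x` do not.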

@jimaek
Member

jimaek commented Jun 1, 2015

Yes, we have some pretty big differences in some projects. Some are hosted only on a single CDN as well. I don't see a solution to every single problem we would have.

@tomByrer
Contributor

tomByrer commented Jun 1, 2015

Do you think this cannot be done well enough through some hybrid matching:

Or just use all the names in all available CDNs (array)? Might make some projects easier to find.

@jimaek
Member

jimaek commented Jun 2, 2015

What you say sounds more like a /search endpoint that allows searching through all the CDNs we have. But it won't work for the main endpoint to get a project's info.
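
A /search along those lines could be sketched as a name index built from every CDN's own names, without attempting any normalization. The input shape and function names are hypothetical.

```javascript
// perCdn: { cdnName: [{ name }, ...] } (assumed shape).
// Returns a Map from lowercased name to the set of CDNs using that name.
function buildNameIndex(perCdn) {
  const index = new Map();
  for (const [cdn, libraries] of Object.entries(perCdn)) {
    for (const { name } of libraries) {
      const key = name.toLowerCase();
      if (!index.has(key)) index.set(key, new Set());
      index.get(key).add(cdn);
    }
  }
  return index;
}

// Substring search over every known name.
function search(index, query) {
  const q = query.toLowerCase();
  return [...index.entries()]
    .filter(([name]) => name.includes(q))
    .map(([name, cdns]) => ({ name, cdns: [...cdns] }));
}
```

This returns candidates rather than one canonical project, which is why it fits a search endpoint better than the main lookup.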
