consider adding some of these misc links to books/articles/papers/etc #29

mortdeus · 2014-03-06T06:31:55Z

While these aren't all technically "research papers", they are still amazing, exotic, and free educational reading resources I've collected over the last few years, so I'll link them in case you guys want to display some of them.

Operating Systems: Three Easy Pieces

Communicating Sequential Processes (CSP)

Unixca's historical links archive

Programming UNIX Sockets in C - Frequently Asked Questions

C Craft

Ken Thompson Q&A

The Cathedral and the Bazaar

What every programmer should know about memory

Linux Technology Reference

1024cores

switch

command center

asktog

The Art of Unix Programming

stitz zeager math books

That should be enough for right now. I still have a bunch more, so if you want me to post more interesting links, just let me know with a reply in this issue's comments.

zeeshanlakhani · 2014-03-06T08:17:16Z

Hey @mortdeus. I've updated our readme to include your links (issues #25, #26, #27, #28) in PR #31. Super thankful! This list is also great, and though we're not exactly sure what to do with non-papers yet, I took a subset of these and included them in a wiki page for further expansion, and added a link to that page from the readme.

We'd definitely add some more as well!

mortdeus · 2014-03-06T08:35:06Z

Lol, soon we're going to need something like a Dewey Decimal System to keep everything organized. For example, I wasn't even aware you guys already had the CSP paper included when I posted the link above.
If we don't figure something out soon, we will run into the situation I have in my Google Drive PDF mega-library (having to spend hours reorganizing/sorting/categorizing/etc).

Any ideas?

zeeshanlakhani · 2014-03-06T08:42:52Z

The reorganizing/sorting/categorizing issue is something we, @papers-we-love/owners, have been discussing for a while. We plan to add a script and/or hook to help w/ naming (from a PDF's title) and de-duping. We're also exploring some other options to deal w/ the organizing, which is definitely a larger issue.
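To make the naming idea concrete, here is a minimal sketch in Python. It assumes the title has already been obtained as a string (reading it out of the PDF's metadata would need a PDF library and is left out here), and it assumes the lowercase snake_case convention seen in paths like `the_google_file_system.pdf`. The function name `title_to_filename` is hypothetical, not an existing script in the repo.

```python
import re
import unicodedata


def title_to_filename(title: str, ext: str = ".pdf") -> str:
    """Turn a paper title into a lowercase snake_case filename (assumed convention)."""
    # Fold accented characters to plain ASCII where possible; drop the rest.
    ascii_title = (
        unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters into a single underscore.
    slug = re.sub(r"[^a-z0-9]+", "_", ascii_title.lower()).strip("_")
    return slug + ext


print(title_to_filename("The Google File System"))  # the_google_file_system.pdf
```

Because two contributors who start from the same title end up with the same filename, a scheme like this would also make name collisions easier to spot during review.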

No fully-complete answers yet, but we're def. on the same wavelength :).

mortdeus · 2014-03-06T09:08:08Z

You guys also need to consider how to vet links to papers to make sure they honor the copyright license. For example, we have to be vigilant and not just allow anybody to post a direct link to a pirated PDF of the Dragon Book (aka Compilers: Principles, Techniques, and Tools) hosted and shared from their personal Google Drive storage.

Also have you guys made sure you are honoring the licenses of the papers being distributed in this repo?

mortdeus · 2014-03-06T09:18:11Z

For example consider the license terms for /distributed_systems/the_google_file_system.pdf

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. SOSP’03, October 19–22, 2003, Bolton Landing, New York, USA. Copyright 2003 ACM 1-58113-757-5/03/0010 ...$5.00.

If the authors sent GitHub a DMCA takedown notice, I assume the whole git repo (and probably forks too) would have to go down with it.

zeeshanlakhani · 2014-03-06T12:44:00Z

Totally agree and good catch. Admittedly, a PR was merged after everything started really taking off that we needed to vet more thoroughly. I was planning on doing that this weekend. I am working on a Contributing.md file, as per #19, that will also explicitly mention this.

zeeshanlakhani · 2014-03-06T12:50:14Z

But, you are right, we really must be more tactful in our approach. Obviously, we'd love to add copyright check/parsing into our automated process.

zeeshanlakhani · 2014-03-06T14:59:10Z

That paper's been removed from the history. Contributing update and audit is on the way.

mortdeus · 2014-03-06T15:07:53Z

I'd put a README.md in each folder with links to papers we don't have permission to distribute directly, similar to the way I posted and formatted the links in my initial issue post.

zeeshanlakhani · 2014-03-06T15:10:37Z

I'm thinking it'd be better to just remove them and then take a better approach going forward, no? Or do you mean we should keep a README as more of an audit trail, after we remove these papers?

mortdeus · 2014-03-06T16:32:57Z

IMHO, I think it would be better to get rid of the pdfs and add their links to a md file that best categorizes them.

For example, say I have 5 hyperlinks to papers related specifically to operating systems.

The first paper talks about generic operating system architecture design.
The next two papers are specifically about Linux's design.
The last two papers are related to Plan 9, but the last one is specific to the Plan 9-derived Inferno operating system.

The way I set this up in my Google Drive is to make a folder called /os, and then make the folders /os/unix/, /os/unix/linux/, /os/plan9/, /os/plan9/inferno/, etc. Then I just put all my PDFs where they belong.

This is basically the same general approach I would recommend you guys take, except each folder has a README.md that we ensure conforms to a specific format we define, so we can build automated tools that perform various tasks on the file hierarchy. (One tool would be a crawler bot that looks for URLs in the README.md files and tests whether each link is still valid, etc.)
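As a rough sketch of what such a crawler could look like (Python, standard library only): it assumes the README.md entries are ordinary markdown links of the form `[title](url)`, which is an assumption about a format that hasn't been defined yet. All function names here are hypothetical.

```python
import http.client
import re
import urllib.request
from pathlib import Path

# Assumed README format: ordinary markdown links such as [title](https://...).
LINK_RE = re.compile(r"\[([^\]]+)\]\((https?://[^)\s]+)\)")


def extract_links(markdown: str) -> list[tuple[str, str]]:
    """Return (title, url) pairs for every markdown link in the text."""
    return LINK_RE.findall(markdown)


def is_alive(url: str, timeout: float = 10.0) -> bool:
    """Send a HEAD request and report whether the server answered without an error."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except (OSError, ValueError, http.client.HTTPException):
        # Covers HTTP errors (4xx/5xx), DNS failures, timeouts, and malformed URLs.
        return False


def find_dead_links(root, check=is_alive):
    """Walk every README.md under root and collect links that fail the check."""
    dead = []
    for readme in sorted(Path(root).rglob("README.md")):
        for title, url in extract_links(readme.read_text(encoding="utf-8")):
            if not check(url):
                dead.append((str(readme), title, url))
    return dead
```

Running `find_dead_links(".")` from the repo root would return a list of `(readme_path, title, url)` tuples to review. Some servers reject HEAD requests, so a real bot would probably want to retry with GET before declaring a link dead.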

Another benefit of using URLs instead of PDFs is that an automated dedup tool could reliably assume that identical links reference the same paper, even when submitted by two different contributors. However, if the same tool were looking for duplicate PDFs by filename, it wouldn't be able to tell, strictly by name, whether /foo/$TITLE.pdf and /bar/$TITLE.pdf are the same paper. Any time the tool finds two PDFs that share the same $TITLE, it would have to compare their contents before its automated removal of suspected dupes is reliable enough for us to trust.
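A minimal sketch of the URL-based dedup, in Python. The normalization rules (lowercase the scheme and host, drop the `#fragment`) are my own assumption about what should count as "identical", and the function names are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Lowercase the scheme and host and drop the fragment (assumed rules)."""
    parts = urlsplit(url.strip())
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path or "/", parts.query, "")
    )


def find_duplicate_links(entries):
    """entries: iterable of (readme_path, url) pairs.

    Returns {normalized_url: [readme_path, ...]} for every URL listed more than once.
    """
    seen = defaultdict(list)
    for location, url in entries:
        seen[normalize_url(url)].append(location)
    return {url: places for url, places in seen.items() if len(places) > 1}


entries = [
    ("os/README.md", "http://Example.com/csp.pdf"),
    ("concurrency/README.md", "http://example.com/csp.pdf#page=2"),
    ("os/README.md", "http://example.com/other.pdf"),
]
print(find_duplicate_links(entries))
# {'http://example.com/csp.pdf': ['os/README.md', 'concurrency/README.md']}
```

This only catches links that are the same after normalization. The same paper hosted at two different addresses would still slip through and need a human to notice.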

DarrenN · 2014-03-07T13:28:17Z

All interesting points -

dedup: we can compare SHA-1 hashes of the PDFs; not perfect, but better than comparing filenames. In combination with filenames, it should be fairly robust.

copyright: agree that we need to be vigilant about copyright, but we also want papers to be as accessible as possible. We would rather err on the side of having fewer papers with clear copyright than more papers than we can audit, with murky licensing status. This is why all PRs now require at least two +1s, and if there is a question we can require more thorough auditing.

The core focus (for now) of the repo is to provide access to foundational computer science papers that are alluded to often, but difficult to find. Linking to foundational books is cool, if they're just links and we can make some effort to check their legal status before dropping them into a wiki page.
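For the SHA-1 comparison mentioned under dedup, a minimal sketch in Python using only the standard library. The function names are hypothetical, and this is one possible shape for such a tool rather than a script that exists in the repo.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha1_of_file(path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large PDFs are never loaded into memory at once."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicate_pdfs(root):
    """Group every *.pdf under root by content hash; return groups with more than one file."""
    by_hash = defaultdict(list)
    for pdf in sorted(Path(root).rglob("*.pdf")):
        by_hash[sha1_of_file(pdf)].append(pdf)
    return [paths for paths in by_hash.values() if len(paths) > 1]
```

Calling `find_duplicate_pdfs(".")` from the repo root lists groups of byte-identical files regardless of their names. The "not perfect" part is that two different scans or downloads of the same paper produce different bytes and therefore different hashes, which is where the filename comparison could still help.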

zeeshanlakhani · 2014-03-07T15:13:14Z

100% agreed @DarrenN.

zeeshanlakhani · 2014-03-07T19:50:32Z

I will say that we're planning to draft up a combined plan going forward, @mortdeus. We'll be auditing the current set of papers. I still think a good number of them can stay in the repo and warrant staying there. For those in murkier territory, or whose license only allows for a URL link, we'll take something like your approach, which should create a good balance of resources.

zeeshanlakhani · 2014-03-08T06:46:37Z

Closing this for now, but we can continue discussion if need be.

zeeshanlakhani closed this as completed Mar 8, 2014
