Display CSV table(s) as search-able HTML on GitHub Pages #8
Comments
This is a fantastic idea, will make it easier for more people to dive into the data. |
I tried implementing this for Here's a version with only the first 9 entries from the data file. It works fine. Here's a version that tries to load the full thing. My browser isn't able to load it (Chrome 50.0.2661.1 on OS X 10.11.3 on a mid 2015 rMBP). |
Any simple alternatives? Or should we not bother? Anyone who wants to can import the CSV into Excel or Google Sheets or your preferred table-searching tool. I’m just going to close this issue now, but feel free to open it up if you want to try another approach. |
Hey all, yeah, this file is 52k rows long (and 13 MB), which is way too much to render on one page with my JavaScript template. A Google Sheet would probably be the easiest approach. |
I think a Google Sheet is a fantastic idea. Plenty of people don't know what a |
Re-opening! Instead of a searchable table for all of This is much smaller and more manageable and would be useful. This would be a mirror of the IPRA portal's table, but with many more columns created by linking with the May + April data. Which I believe is what @rajivsinclair may have meant in the first place? Reference: #6 (comment) |
^^ @DGalt |
@alexsoble will likely start working on this at hacknight tonight. Just a heads up, though, there's probably still a little cleaning that needs to go on with that CSV. I have not looked at the cleaned February data yet, so I don't know what, if anything, should be pulled out of that and added to this. |
Sounds good! |
@DGalt I plugged your work-in-progress CSV into the template: https://github.com/invinst/ipra-table. It works fine. Takes a couple of seconds to load, and there's a lot of [nan] data, but it's a very serviceable first draft. Made it clear in the commit + repo that this is a work in progress and not a final product. There's no public internet URL yet. We can look at this together at HackNight tonight if you want. |
After our discussion last night I think we need to decide whether we want to continue merging the April and May dumps, or whether we want to keep them separate. Looking at the list of 102 CRIDs in the June IPRA dataset, 39 of them do not exist in the May data set, 8 of them do not exist in the April data set, and 4 of them do not exist in the data set composed of the combination of May and April. If we're just trying to provide as much information as possible then we should continue merging the two together, but as Chacyln explained last night, these are data sets covering two different things. As for your other question @alexsoble about percentage of
Edit: sorted above by percentage |
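For anyone who wants to reproduce the kind of CRID coverage check described above (how many June CRIDs are missing from May, from April, and from the two combined), here is a minimal sketch using plain Python sets. The function name `missing_crids` and the sample CRIDs are made up for illustration; they are not taken from the actual data files or from the script that produced the counts above.

```python
def missing_crids(reference, *datasets):
    """Return the CRIDs in `reference` that appear in none of `datasets`."""
    # Union of every CRID seen across the comparison datasets
    combined = set().union(*datasets)
    return sorted(set(reference) - combined)


# Made-up CRIDs for illustration only (not real data)
june = ["1001", "1002", "1003", "1004"]
april = ["1001", "1002"]
may = ["1002", "1003"]

not_in_may = missing_crids(june, may)
not_in_april = missing_crids(june, april)
not_in_either = missing_crids(june, april, may)

print(len(not_in_may), len(not_in_april), len(not_in_either))
```

Passing both dumps at once checks against their combination, which is the case that matters when deciding whether to keep merging them.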
Thanks @DGalt! |
@DGalt A bunch of the rows have |
Right, so, in the full data set there are likely several rows for any one CRID. The way I've organized this is that each CRID has a row, and each column contains a list of the unique values that were in all of the rows for that CRID (so basically the multiple rows are collapsed into a list for any one column). So in the example you give, that row corresponds to You'll see We should discuss, though, what to do with those |
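To make the collapsing step described above concrete, here is a rough sketch of one way to do it with only the Python standard library: one output row per CRID, with each column holding the list of unique values seen across that CRID's rows. This is not necessarily how the actual CSV was built. The column names (`CRID`, `officer`, `category`) and the sample rows are hypothetical.

```python
import csv
import io
from collections import defaultdict


def collapse_by_crid(rows, key="CRID"):
    """Collapse many rows per CRID into one row of unique-value lists."""
    collapsed = defaultdict(lambda: defaultdict(list))
    for row in rows:
        crid = row[key]
        for column, value in row.items():
            if column == key:
                continue
            # Keep first-seen order and skip values already recorded
            if value not in collapsed[crid][column]:
                collapsed[crid][column].append(value)
    return {crid: dict(cols) for crid, cols in collapsed.items()}


# Hypothetical sample data, not taken from the real files
sample = io.StringIO(
    "CRID,officer,category\n"
    "1001,Officer A,Use of force\n"
    "1001,Officer B,Use of force\n"
    "1002,,Search\n"
)
result = collapse_by_crid(csv.DictReader(sample))
print(result["1001"])
```

Note that empty cells are kept as empty strings here. Whether to drop them, or show them as blanks in the table, is a separate decision.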
OK great, thanks for the explanation @DGalt! |
Still WIP, working on a trimmed slice of the data with 5 columns, 28 rows (rows where we have accused officer names): |
How many columns can reasonably be presented before it becomes illegible? I also wonder if there's some way to truncate certain columns that are excessively long (e.g. the ones with URLs). |
Great question, not sure. More columns, more work for reader. |
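On the truncation idea, one possible approach is to shorten long cell values before they are written into the table. The sketch below is an assumption about how that could look, not something the template does today; `truncate_cell`, its default length, and the example URL are all hypothetical.

```python
def truncate_cell(text, max_len=40, marker="..."):
    """Shorten `text` to at most `max_len` characters, ending with `marker`."""
    if len(text) <= max_len:
        return text
    if max_len <= len(marker):
        # Not enough room for the marker, so just cut the text
        return text[:max_len]
    return text[: max_len - len(marker)] + marker


# Hypothetical long URL for illustration
long_url = "https://example.org/" + "a" * 80
short = truncate_cell(long_url)
print(short)
```

A truncated URL is no longer usable as a link, so the full value would still need to be kept somewhere (for example as the link target) if readers are meant to follow it.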