first cut at exported data

commit ac679b5dfff6bc1e6c68bcc9ec1d6d75151e4bbc 1 parent baa5cbd
Mike Champion graysky authored

Showing 5 changed files with 19,469 additions and 2,766 deletions.

  1. +48 13 README.md
  2. +30 5 importer.rb
  3. +2,748 2,748 items.txt
  4. +14,215 0 reviews.txt
  5. +2,428 0 screenshots.txt
61 README.md
... ... @@ -1,25 +1,60 @@
1 1 # oneforty app data
2 2
3   -Future home of open data about 4,000+ social media applications.
  3 +Open data on about 4,000 social media apps and related screenshots and reviews from oneforty.com.
4 4
5   -# License
  5 +
  6 +## License
6 7
7 8 This data is licensed under the Creative Commons Attribution 3.0 license (http://creativecommons.org/licenses/by/3.0/).
8 9
9 10 It includes a requirement that
10 11
11 12
12   -# Data
13   -
14   -- active Apps
15   --- id
16   --- name
17   --- desc
18   --- tagline
19   --- url
20   --- created_at
21   -
22   -# Opening in Excel
  13 +## Data and Schema
  14 +
  15 +The following describes the schema for the data files. Each file is tab-delimited and is in utf-8. There is an example importer included that uses Ruby 1.9's CSV library to parse each document.
  16 +
  17 +- items (aka apps)
  18 +-- id - database id of the app.
  19 +-- name - app name.
  20 +-- tagline (optional) - short (140 character) description of the app.
  21 +-- twitter account - if the app registered a related twitter account.
  22 +-- url - URL for the app's homepage.
  23 +-- permalink - generated permalink used as the slug on oneforty's urls.
  24 +-- rank score - an opaque estimate of popularity, ranging from 0.0 - 100.0.
  25 +-- average rating - average rating from users from 0 to 5.0. Note an average rating of 0.0 means no ratings (since the lowest rating is a 1 star)
  26 +-- created at - UTC timestamp when the app was created.
  27 +-- platform1/platform2/platform3 - list of up to 3 platforms (iPhone, Mac, etc). List of platforms is a fixed set.
  28 +-- category1/category2/category3 - list of up to 3 categories (Clients, Analytics, etc). List of categories is a fixed set.
  29 +-- tags - User-generated tags separated by commas. Free-form values.
  30 +-- developer name - if known, the name of the app developer
  31 +-- developer twitter - if known, the twitter handle of the app developer
  32 +-- description - long form app description.
  33 +
  34 +Many items have an accompanying icon (or logo) in the images/items directory. They are named like [item id]_[style].png where style is either "thumb" (100x100) or "original" (no fixed dimensions).
  35 +
  36 +- reviews
  37 +-- id - database id of the review.
  38 +-- item_id - id of the reviewed app.
  39 +-- reviewer name - name of the reviewer.
  40 +-- reviewer twitter - twitter handle of the reviewer
  41 +-- rating - rating from 1 to 5. Note: rating is optional.
  42 +-- quality score - higher score indicates it was valuable to other users. A score of 0 is neutral. Many spammy reviews have been removed already.
  43 +-- created at - UTC timestamp.
  44 +-- review - long form body of the review.
  45 +
  46 +- screenshots - developer or user supplied app screenshots.
  47 +-- id - database id of the screenshot
  48 +-- item_id - id of the screenshotted app.
  49 +-- title - caption for the screenshot.
  50 +-- content type - content type of the screenshot (ex image/png, image/jpeg)
  51 +-- original file name - file name of the "original"-sized screenshot in images/screenshots
  52 +-- thumb file name - file name of the "thumb"-sized (100x100) screenshot in images/screenshots
  53 +-- large file name - file name of the "large"-sized (400x400) screenshot in images/screenshots
  54 +-- created at - UTC timestamp.
  55 +
  56 +
  57 +### Opening in Excel
23 58
24 59 1. Import the .txt file into Excel
25 60 2. In Text Import Wizard, on 1st screen choose "Delimited"
35 importer.rb
@@ -10,20 +10,45 @@ def items
10 10 i = 0
11 11 CSV.foreach("items.txt", csv_options) do |row|
12 12 h = row.to_hash
13   - puts h.inspect
  13 + #puts h.inspect
14 14 i += 1
15 15 end
16 16
17   - puts "Imported #{i} entries."
  17 + puts "Processed #{i} items."
18 18 end
  19 +
  20 + def reviews
  21 + i = 0
  22 + CSV.foreach("reviews.txt", csv_options) do |row|
  23 + h = row.to_hash
  24 + #puts h.inspect
  25 + i += 1
  26 + end
  27 +
  28 + puts "Processed #{i} reviews."
  29 + end
  30 +
  31 + def screenshots
  32 + i = 0
  33 + CSV.foreach("screenshots.txt", csv_options) do |row|
  34 + h = row.to_hash
  35 + #puts h.inspect
  36 + i += 1
  37 + end
19 38
  39 + puts "Processed #{i} screenshots."
  40 + end
  41 +
20 42 protected
21 43
22 44 def csv_options
23   - {:col_sep => "\t", :headers => :first_row, :quote_char=>'"', :skip_blanks => true, :encoding => "u"}
  45 + # Tab delimited
  46 + {:col_sep => "\t", :headers => :first_row, :quote_char=>'"', :skip_blanks => true, :encoding => "utf-8"}
24 47 end
25 48
26 49 end
27 50
28   -# Run the importer
29   -Importer.new.items()
  51 +# Run the importers
  52 +Importer.new.items()
  53 +Importer.new.reviews()
  54 +Importer.new.screenshots()
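Review note: the three methods added to importer.rb differ only in the file name they read and the label they print. As a minimal sketch (not part of this commit), the shared loop could live in one helper. `count_rows` and `CSV_OPTIONS` are hypothetical names, and the demo writes a throwaway file instead of reading the real data files.

```ruby
require "csv"
require "tmpdir"

# Same options as csv_options in importer.rb: tab-delimited, header row, utf-8.
CSV_OPTIONS = { col_sep: "\t", headers: :first_row, quote_char: '"',
                skip_blanks: true, encoding: "utf-8" }.freeze

# Hypothetical helper (not in the commit): count the data rows in one file.
def count_rows(path)
  count = 0
  CSV.foreach(path, **CSV_OPTIONS) { |_row| count += 1 }
  count
end

# Demo against a temporary file so the sketch runs without the real data.
Dir.mktmpdir do |dir|
  path = File.join(dir, "sample.txt")
  File.write(path, "id\tname\n1\tExample App\n2\tOther App\n")
  puts "Processed #{count_rows(path)} items."
end
```

With a helper like this, `items`, `reviews`, and `screenshots` would each reduce to one call plus a `puts`, assuming the per-row hash is not needed elsewhere.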
5,496 items.txt
2,748 additions, 2,748 deletions not shown
14,215 reviews.txt
14,215 additions, 0 deletions not shown
2,428 screenshots.txt
2,428 additions, 0 deletions not shown
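
The schema in README.md says that reviews point at apps through item_id and that rating is optional. A minimal sketch of grouping reviews per app under those two statements follows. The header strings `id`, `name`, `item_id`, and `rating` are assumptions (the header rows of items.txt and reviews.txt are not shown in this diff), the helper names are hypothetical, and the inline sample rows are made up for illustration.

```ruby
require "csv"

# Tab-delimited with a header row, as described in README.md.
TSV_OPTIONS = { col_sep: "\t", headers: :first_row, quote_char: '"',
                skip_blanks: true }.freeze

# Hypothetical helper: parse tab-delimited text into an array of hashes.
def parse_rows(text)
  CSV.parse(text, **TSV_OPTIONS).map(&:to_h)
end

# Hypothetical helper: group review rows by the assumed "item_id" column.
def reviews_by_item(reviews)
  reviews.group_by { |review| review["item_id"] }
end

# Made-up sample data standing in for items.txt and reviews.txt.
ITEMS_SAMPLE = "id\tname\n1\tExample App\n2\tOther App\n"
REVIEWS_SAMPLE = "id\titem_id\trating\n10\t1\t5\n11\t1\t4\n12\t2\t\n"

items = parse_rows(ITEMS_SAMPLE)
grouped = reviews_by_item(parse_rows(REVIEWS_SAMPLE))

items.each do |item|
  reviews = grouped.fetch(item["id"], [])
  # An empty rating field parses as nil, so drop it before counting.
  rated = reviews.map { |r| r["rating"] }.compact
  puts "#{item["name"]}: #{reviews.size} review(s), #{rated.size} with a rating"
end
```

All values come back as strings, so ids compare as `"1"` rather than `1`; convert with `Integer(...)` or `Float(...)` where numeric comparison matters.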
