
smith11235/Tables


Generalized Data Analytics

The purpose of this suite is to provide advanced database functionality, such as searching, processing, and publishing, on imperfect domains of data in a simplified manner, for fast prototyping and ease of use by non-technical people.

- manual: by hand (on your phone), e.g. keeping a todo list
- browser upload: select a formatted file to load records from
  - you could be a grad researcher, not a software developer
- data mine: enter a URL, then enter XPath queries or regexes to parse websites
  - scheduled: add a schedule for polling the URL
- programmatically: through our logging API from your code base
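The data mine option above can be illustrated with a small sketch. It shows how a regex with named groups might turn page markup into records. The sample markup, the `PRODUCT_RE` pattern, and the `extract_records` helper are all hypothetical and are not taken from this project's code; fetching the URL and scheduling are left out.

```python
import re

# Hypothetical page content; in practice this would come from the URL the user enters.
SAMPLE_HTML = """
<ul>
  <li class="product">Widget <span class="price">$19.99</span></li>
  <li class="product">Gadget <span class="price">$49.50</span></li>
</ul>
"""

# Hypothetical pattern: one named group per field to record.
PRODUCT_RE = re.compile(
    r'<li class="product">(?P<name>[^<]+?)\s*'
    r'<span class="price">\$(?P<price>[\d.]+)</span>'
)


def extract_records(html):
    """Return one dict per regex match, keeping every value as a raw string."""
    return [match.groupdict() for match in PRODUCT_RE.finditer(html)]


records = extract_records(SAMPLE_HTML)
for record in records:
    print(record)
```

Values stay as strings on purpose: nothing is rejected or coerced at load time, and typing can happen later.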

- rely on our user management
- load data through our web site to our databases
- rely on our site, but use your database
- rely on your own clone of our site and your own database

- prototyping:
  - the data is of questionable quality
  - don't deal with MySQL's pickiness until you are ready
- lossless:
  - accept and record everything from the loader
  - skip debugging LOAD statement errors due to schema issues
  - more like a NoSQL store such as MongoDB in this regard
- revisioned:
  - see how each set of data (think: a table in MySQL) evolves
  - don't lose old records if they are accidentally deleted in updates
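A minimal sketch of the lossless and revisioned ideas above, assuming an in-memory store. The `RevisionedDataSet` class and its method names are hypothetical illustrations, not this project's API. Every load is kept as its own revision with no schema checks, so a record dropped by a later update can still be found in an earlier revision.

```python
class RevisionedDataSet:
    """Hypothetical in-memory store: each load becomes a new revision."""

    def __init__(self):
        self._revisions = []  # one list of record dicts per load

    def load(self, records):
        """Store every record as-is (no schema checks); return the 1-based revision number."""
        self._revisions.append([dict(record) for record in records])
        return len(self._revisions)

    def revision(self, number):
        """Return the records of a given 1-based revision."""
        return self._revisions[number - 1]

    def latest(self):
        """Return the most recent revision, or an empty list if nothing was loaded."""
        return self._revisions[-1] if self._revisions else []

    def missing_since(self, old, new, key):
        """Records present in revision `old` whose `key` value is absent from revision `new`."""
        new_keys = {record.get(key) for record in self.revision(new)}
        return [r for r in self.revision(old) if r.get(key) not in new_keys]


data_set = RevisionedDataSet()
data_set.load([
    {"sku": "A1", "price": "19.99"},
    {"sku": "B2", "price": "n/a", "note": "unexpected field"},  # accepted anyway
])
data_set.load([{"sku": "A1", "price": "21.00"}])  # B2 was dropped by this update

dropped = data_set.missing_since(1, 2, key="sku")
print(dropped)
```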

- standard statistical information displays
  - is a field generally unique?
    - example: the price of a product
    - out of 2,000 products, there are 1,773 unique prices
  - is the field a flag/enumerated value field?
    - example: the category of a product
    - 10 distinct values across 2,000 products
    - the distribution of records across this flag
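The two questions above come down to counting distinct values per field. The sketch below is one way to compute them. The `field_profile` and `classify` helpers and both thresholds are assumptions made for illustration, and the records are synthetic, built only to reproduce the counts quoted above.

```python
from collections import Counter


def field_profile(records, field):
    """Count total and distinct values of `field`, plus the per-value distribution."""
    counts = Counter(r[field] for r in records if field in r)
    total = sum(counts.values())
    return {
        "total": total,
        "distinct": len(counts),
        "unique_ratio": len(counts) / total if total else 0.0,
        "distribution": dict(counts),
    }


def classify(profile, unique_threshold=0.8, enum_max_distinct=20):
    """Rough label for a field; both thresholds are arbitrary assumptions."""
    if profile["unique_ratio"] >= unique_threshold:
        return "mostly unique"
    if profile["distinct"] <= enum_max_distinct:
        return "enumerated"
    return "mixed"


# Synthetic data: 2,000 products, 1,773 distinct prices, 10 distinct categories.
products = [{"price": i % 1773, "category": i % 10} for i in range(2000)]

price_profile = field_profile(products, "price")
category_profile = field_profile(products, "category")

print(price_profile["distinct"], classify(price_profile))
print(category_profile["distinct"], classify(category_profile))
```

The `distribution` entry of the category profile gives the number of records per flag value, which is the third display mentioned in the list.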

- create filters and indexes on a data set
- keys:
  - define the field and value ranges you care about
- link data sets together using keys:
  - similar to a MySQL JOIN
  - using selected fields
- conditional statements
  - example: price < 50
- everything is mapped to a URL so you can share and revisit
- meant to be usable by a non-technical user
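As a rough sketch of the items above: a condition such as price < 50 can be applied as a filter, two data sets can be linked on a shared key, and the resulting view can be encoded as a URL. The helper names, the operator codes, and the `https://example.com/views` base URL are all hypothetical; the project's actual URL scheme is not described here.

```python
import operator
from urllib.parse import urlencode

# Hypothetical operator codes for conditional statements.
OPS = {"lt": operator.lt, "le": operator.le, "eq": operator.eq,
       "ge": operator.ge, "gt": operator.gt}

BASE_URL = "https://example.com/views"  # placeholder, not a real endpoint


def filter_records(records, field, op, value):
    """Keep records whose numeric `field` satisfies the condition; skip unparseable values."""
    compare = OPS[op]
    kept = []
    for record in records:
        try:
            if compare(float(record[field]), float(value)):
                kept.append(record)
        except (KeyError, ValueError, TypeError):
            continue  # questionable data is ignored by the filter, not rejected at load
    return kept


def join_on_key(left, right, key):
    """Inner join: merge each left record with every right record sharing the key value."""
    index = {}
    for record in right:
        if key in record:
            index.setdefault(record[key], []).append(record)
    return [{**l, **r} for l in left if key in l for r in index.get(l[key], [])]


def view_url(dataset, field, op, value):
    """Encode a filter as a shareable URL."""
    query = urlencode({"dataset": dataset, "field": field, "op": op, "value": value})
    return f"{BASE_URL}?{query}"


products = [{"sku": "A1", "price": "19.99"}, {"sku": "B2", "price": "75.00"},
            {"sku": "C3", "price": "n/a"}]
stock = [{"sku": "A1", "qty": "12"}, {"sku": "B2", "qty": "0"}]

cheap = filter_records(products, "price", "lt", 50)
joined = join_on_key(cheap, stock, "sku")
url = view_url("products", "price", "lt", 50)
print(joined)
print(url)
```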

- data set revision validation
  - statistical proofing based on rules to ensure data quality before release
- publish reports to clients under your name
  - easily
  - privately
- speed optimization:
  - from your Keys and Views
  - publish an optimized Rails application
  - with a schema and views defined explicitly for your data domain
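Rule-based proofing of a revision, as described above, could look something like the following sketch. The two rules (a cap on how much the record count may drop, and a cap on how many records may lack a required field), the default thresholds, and the `validate_revision` name are assumptions chosen for illustration only.

```python
def validate_revision(previous, current, required_field,
                      max_drop=0.10, max_missing=0.05):
    """Return a list of human-readable problems; an empty list means the revision passes."""
    problems = []

    # Rule 1: the record count should not shrink by more than `max_drop`.
    if previous and len(current) < len(previous) * (1 - max_drop):
        problems.append(
            f"record count dropped from {len(previous)} to {len(current)}"
        )

    # Rule 2: at most `max_missing` of the records may lack the required field.
    missing = sum(1 for record in current if not record.get(required_field))
    if current and missing / len(current) > max_missing:
        problems.append(
            f"{missing} of {len(current)} records are missing '{required_field}'"
        )

    return problems


previous = [{"sku": f"P{i}", "price": "10.00"} for i in range(10)]
current = [{"sku": f"P{i}", "price": "10.00"} for i in range(7)] + [{"sku": "P7"}]

problems = validate_revision(previous, current, "price")
for problem in problems:
    print(problem)
```

A revision would only be released when the returned list is empty.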

About

Interface for logging, visualizing, and researching datasets.
