base fork: igorman/data-engineering
base: master
...
head fork: gcatlin/data-engineering
compare: gcatlin
  • 10 commits
  • 12 files changed
  • 0 commit comments
  • 3 contributors
Commits on Oct 12, 2011
@bmuller bmuller updated submission instructions 2d5a415
@bmuller bmuller reformatted test input file f0857cb
Commits on Oct 19, 2011
@bmuller bmuller now accepting patches e55a628
@bmuller bmuller cleared up submission instructions for patch files ffc2671
Commits on Oct 26, 2011
@KeeperPat KeeperPat Make the alternate submission instructions (for people who don't want their current employers to know that they're completing our challenge) more prominent. Too many candidates have been submitting pull requests from private repositories which become part of their public activity feed. 09bfd82
Commits on Oct 31, 2011
@bmuller bmuller modified README to include new dev.challenges@ls addy 3d5bc31
@bmuller bmuller Fixed formatting issues in README 44bac2a
Commits on Nov 10, 2011
@bmuller bmuller added associate developer position to list 224e963
Commits on Mar 26, 2012
@gcatlin gcatlin my brilliant submission 6ddc136
@gcatlin gcatlin Updates README 68b019c
20 README.markdown
@@ -1,12 +1,9 @@
-# Challenge for Software Engineer - Big Data
-To better assess a candidates development skills, we would like to provide the following challenge. You have as much time as you'd like (though we ask that you not spend more than a few hours) and may use any programming language or framework you'd like. Feel free to email [data.challenge@livingsocial.com](mailto:data.challenge@livingsocial.com) if you have any questions.
+## Setup Instructions
+1. Clone the repo locally.
+1. Make the www/ dir web accessible.
+1. Browse the www/ dir.
-## Submission Instructions
-1. First, fork this project on github. You will need to create an account if you don't already have one.
-1. Next, complete the project as described below within your fork.
-1. Finally, push all of your changes to your fork on github and submit a pull request.
-
-## Project Description
+## The Challenge
Imagine that LivingSocial has just acquired a new company. Unfortunately, the company has never stored their data in a database and instead uses a plain text file. We need to create a way for the new subsidiary to import their data into a database. Your task is to create a web interface that accepts file uploads, normalizes the data, and then stores it in a relational database.
Here's what your web-based application must do:
@@ -22,10 +19,3 @@ Your application does not need to:
1. be aesthetically pleasing
Your application should be easy to set up and should run on either Linux or Mac OS X. It should not require any for-pay software.
-
-## Evaluation
-Evaluation of your submission will be based on the following criteria:
-
-1. Did your application fulfill the basic requirements?
-1. Did you document the method for setting up and running your application?
-1. Did you follow the instructions for submission?
BIN  db/db.sqlite
Binary file not shown
8 example_input.tab
@@ -1,5 +1,5 @@
purchaser name	item description	item price	purchase count	merchant address	merchant name
-Snake Plissken	$10 off $20 of food	10.0	2	987 Fake St	Bob's Pizza
-Amy Pond	$30 of awesome for $10	10.0	5	456 Unreal Rd	Tom's Awesome Shop
-Marty McFly	$20 Sneakers for $5	5.0	1	123 Fake St	Sneaker Store Emporium
-Snake Plissken	$20 Sneakers for $5	5.0	4	123 Fake St	Sneaker Store Emporium
+Snake Plissken	$10 off $20 of food	10.0	2	987 Fake St	Bob's Pizza
+Amy Pond	$30 of awesome for $10	10.0	5	456 Unreal Rd	Tom's Awesome Shop
+Marty McFly	$20 Sneakers for $5	5.0	1	123 Fake St	Sneaker Store Emporium
+Snake Plissken	$20 Sneakers for $5	5.0	4	123 Fake St	Sneaker Store Emporium
3  export_data
@@ -0,0 +1,3 @@
+#!/bin/bash
+
+sqlite3 db/db.sqlite < export_data.sql
16 export_data.sql
@@ -0,0 +1,16 @@
+.mode tabs
+.headers on
+
+SELECT
+ purchaser.name as 'purchaser name',
+ item.description as 'item description',
+ item.price as 'item price',
+ purchase.quantity as 'purchase count',
+ merchant.address as 'merchant address',
+ merchant.name as 'merchant name'
+FROM
+ purchase
+ JOIN purchaser USING (purchaser_id)
+ JOIN item USING (item_id)
+ JOIN merchant USING (merchant_id)
+ORDER BY purchase.purchase_id;
61 normalize.php
@@ -0,0 +1,61 @@
+<?php
+
+$base_dir = __DIR__;
+$db = new PDO("sqlite:{$base_dir}/db/db.sqlite");
+$purchaser_insert = $db->prepare("INSERT INTO purchaser (name) VALUES (:purchaser_name)");
+$merchant_insert = $db->prepare("INSERT INTO merchant (name, address) VALUES (:merchant_name, :merchant_address)");
+$item_insert = $db->prepare("INSERT INTO item (merchant_id, price, description) VALUES (:merchant_id, :item_price, :item_description)");
+$purchase_insert = $db->prepare("INSERT INTO purchase (purchaser_id, item_id, quantity) VALUES (:purchaser_id, :item_id, :purchase_count)");
+
+$purchaser_id_map = array();
+$merchant_id_map = array();
+$item_id_map = array();
+
+fgets(STDIN); // discard the header
+while (!feof(STDIN)) {
+ $line = rtrim(fgets(STDIN), "\r\n"); // strip only the line ending so empty leading or trailing fields keep their tabs
+ if ($line === '') {
+ continue;
+ }
+
+ $fields = explode("\t", $line);
+ $purchaser_name = $fields[0];
+ $item_description = $fields[1];
+ $item_price = $fields[2];
+ $purchase_count = $fields[3];
+ $merchant_address = $fields[4];
+ $merchant_name = $fields[5];
+
+ if (!isset($purchaser_id_map[$purchaser_name])) {
+ $purchaser_insert->execute(array(
+ ':purchaser_name' => $purchaser_name
+ ));
+ $purchaser_id_map[$purchaser_name] = $db->lastInsertId();
+ }
+ $purchaser_id = $purchaser_id_map[$purchaser_name];
+
+ if (!isset($merchant_id_map[$merchant_name])) {
+ $merchant_insert->execute(array(
+ ':merchant_name' => $merchant_name,
+ ':merchant_address' => $merchant_address
+ ));
+ $merchant_id_map[$merchant_name] = $db->lastInsertId();
+ }
+ $merchant_id = $merchant_id_map[$merchant_name];
+
+ if (!isset($item_id_map[$merchant_id][$item_description])) {
+ $item_insert->execute(array(
+ ':merchant_id' => $merchant_id,
+ ':item_price' => $item_price,
+ ':item_description' => $item_description
+ ));
+ $item_id_map[$merchant_id][$item_description] = $db->lastInsertId();
+ }
+ $item_id = $item_id_map[$merchant_id][$item_description];
+
+ $purchase_insert->execute(array(
+ ':purchaser_id' => $purchaser_id,
+ ':item_id' => $item_id,
+ ':purchase_count' => $purchase_count
+ ));
+}
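normalize.php deduplicates purchasers, merchants, and items with in-memory maps, so each distinct value is inserted once and later rows reuse its id. The same first-seen-wins idea can be sketched in awk as a quick sanity check on an input file. This is an illustration only, not part of the submission: the temporary file, the variable names, and keying items by merchant name plus description (the PHP keys them by merchant_id) are all assumptions of the sketch.

```shell
#!/bin/bash
# Hypothetical sample data mirroring example_input.tab (six tab-separated fields).
sample=$(mktemp)
printf '%s\t%s\t%s\t%s\t%s\t%s\n' \
  'purchaser name' 'item description' 'item price' 'purchase count' 'merchant address' 'merchant name' \
  'Snake Plissken' '$10 off $20 of food' 10.0 2 '987 Fake St' "Bob's Pizza" \
  'Amy Pond' '$30 of awesome for $10' 10.0 5 '456 Unreal Rd' "Tom's Awesome Shop" \
  'Marty McFly' '$20 Sneakers for $5' 5.0 1 '123 Fake St' 'Sneaker Store Emporium' \
  'Snake Plissken' '$20 Sneakers for $5' 5.0 4 '123 Fake St' 'Sneaker Store Emporium' \
  > "$sample"

# Assign an id to each value the first time it is seen, as the PHP id maps do.
counts=$(awk -F '\t' '
  NR == 1 { next }                              # skip the header row
  !($1 in purchaser) { purchaser[$1] = ++np }   # one id per purchaser name
  !($6 in merchant)  { merchant[$6] = ++nm }    # one id per merchant name
  { key = $6 SUBSEP $2 }                        # item key: merchant name + description
  !(key in item)     { item[key] = ++ni }
  { rows++ }                                    # every data row is one purchase
  END { print np, nm, ni, rows }
' "$sample")

echo "$counts"   # purchasers merchants items purchases
rm -f "$sample"
```

For the four sample rows this prints `3 3 3 4`, which should match the row counts of the purchaser, merchant, item, and purchase tables after an import into a fresh database.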
6 setup_db
@@ -0,0 +1,6 @@
+#!/bin/bash
+
+rm -rf db
+mkdir -pm 777 db
+sqlite3 db/db.sqlite < setup_db.sql
+chmod 666 db/db.sqlite
33 setup_db.sql
@@ -0,0 +1,33 @@
+PRAGMA foreign_keys = 1;
+
+DROP TABLE IF EXISTS purchaser;
+CREATE TABLE purchaser (
+ purchaser_id INTEGER PRIMARY KEY AUTOINCREMENT,
+ name TEXT
+);
+
+DROP TABLE IF EXISTS merchant;
+CREATE TABLE merchant (
+ merchant_id INTEGER PRIMARY KEY AUTOINCREMENT,
+ name TEXT,
+ address TEXT
+);
+
+DROP TABLE IF EXISTS item;
+CREATE TABLE item (
+ item_id INTEGER PRIMARY KEY AUTOINCREMENT,
+ merchant_id INTEGER,
+ price REAL,
+ description TEXT,
+ FOREIGN KEY(merchant_id) REFERENCES merchant(merchant_id)
+);
+
+DROP TABLE IF EXISTS purchase;
+CREATE TABLE purchase (
+ purchase_id INTEGER PRIMARY KEY AUTOINCREMENT,
+ purchaser_id INTEGER,
+ item_id INTEGER,
+ quantity INTEGER,
+ FOREIGN KEY(purchaser_id) REFERENCES purchaser(purchaser_id),
+ FOREIGN KEY(item_id) REFERENCES item(item_id)
+);
5 test_import
@@ -0,0 +1,5 @@
+#!/bin/bash
+
+./setup_db
+php normalize.php < example_input.tab
+./export_data
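test_import prints the exported rows but leaves the comparison with the input to the reader. A small helper could make the round trip self-checking. The sketch below is an assumption about how that might look, not something the submission contains: `same_rows` is a hypothetical name, and the two temporary files stand in for example_input.tab and the export output.

```shell
#!/bin/bash
# same_rows A B: succeed when both files contain the same lines, ignoring order.
same_rows() {
  [ "$(sort "$1")" = "$(sort "$2")" ]
}

# Stand-ins for the input file and the exported file (hypothetical contents).
input=$(mktemp)
exported=$(mktemp)
printf 'header\nrow one\nrow two\n' > "$input"
printf 'header\nrow two\nrow one\n' > "$exported"

if same_rows "$input" "$exported"; then
  result="round trip OK"
else
  result="round trip MISMATCH"
fi
echo "$result"
rm -f "$input" "$exported"
```

In test_import this might be used by redirecting the export to a file and calling `same_rows example_input.tab` on it. Sorting both sides keeps the check independent of the order in which the export query returns rows.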
38 www/import.php
@@ -0,0 +1,38 @@
+<?php
+
+$error_map = array(
+ UPLOAD_ERR_INI_SIZE => "The uploaded file exceeds the maximum allowed size of " . ini_get('upload_max_filesize') . ".",
+ UPLOAD_ERR_PARTIAL => "The uploaded file was only partially uploaded.",
+ UPLOAD_ERR_NO_FILE => "No file was uploaded.",
+ UPLOAD_ERR_NO_TMP_DIR => "Missing a temporary folder.",
+ UPLOAD_ERR_CANT_WRITE => "Failed to write file to disk.",
+);
+
+$params = array();
+if (isset($_FILES['import'])) {
+ if ($_FILES['import']['error'] === 0) {
+ $file = $_FILES['import']['tmp_name'];
+
+ // Stand-in for an asynchronous job: shell_exec() blocks until normalize.php exits.
+ $base_dir = dirname(__DIR__);
+ $output = shell_exec("php " . escapeshellarg("{$base_dir}/normalize.php") . " < " . escapeshellarg($file));
+
+ if ($output) {
+ $params['error'] = $output;
+ } else {
+
+ // Revenue calculation could happen in the same step as normalization
+ // but it is a bit cleaner, though less efficient, to make it a
+ // separate step. It should also occur as part of an asynchronous
+ // workflow.
+ $params['revenue'] = shell_exec("awk -F '\\t' 'NR>1 {sum += \$3 * \$4} END {printf \"%.2f\", sum}' " . escapeshellarg($file));
+
+ }
+ } else {
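+ $params['error'] = isset($error_map[$_FILES['import']['error']]) ? $error_map[$_FILES['import']['error']] : "Upload failed with error code " . $_FILES['import']['error'] . ".";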
+ }
+} else {
+ $params['error'] = $error_map[UPLOAD_ERR_NO_FILE];
+}
+
+header('Location: import_complete.php?' . http_build_query($params));
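The revenue figure depends on awk splitting each row on tabs, so the field separator has to be a tab and not a literal letter. As a minimal sketch of that step on its own (not the submission's code), the following rebuilds the four sample rows in a temporary file and sums price times count. The `sample` and `total` names and the temporary file are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical sample data mirroring example_input.tab (six tab-separated fields).
sample=$(mktemp)
printf '%s\t%s\t%s\t%s\t%s\t%s\n' \
  'purchaser name' 'item description' 'item price' 'purchase count' 'merchant address' 'merchant name' \
  'Snake Plissken' '$10 off $20 of food' 10.0 2 '987 Fake St' "Bob's Pizza" \
  'Amy Pond' '$30 of awesome for $10' 10.0 5 '456 Unreal Rd' "Tom's Awesome Shop" \
  'Marty McFly' '$20 Sneakers for $5' 5.0 1 '123 Fake St' 'Sneaker Store Emporium' \
  'Snake Plissken' '$20 Sneakers for $5' 5.0 4 '123 Fake St' 'Sneaker Store Emporium' \
  > "$sample"

# -F '\t' sets tab as the field separator; NR>1 skips the header row.
# Field 3 is the item price and field 4 is the purchase count.
total=$(awk -F '\t' 'NR>1 {sum += $3 * $4} END {printf "%.2f", sum}' "$sample")

echo "$total"
rm -f "$sample"
```

For the sample rows this prints `95.00` (20 + 50 + 5 + 20). Splitting on anything other than a tab would break here, because item descriptions and addresses contain spaces.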
12 www/import_complete.php
@@ -0,0 +1,12 @@
+<html>
+<body>
+<?php if (isset($_GET['revenue'])): ?>
+ <h1>Import Successful!</h1>
+ <p>Revenue: $<?php echo htmlspecialchars($_GET['revenue']) ?></p>
+<?php elseif (isset($_GET['error'])): ?>
+ <h1>Import Failed!</h1>
+ <p><?php echo htmlspecialchars($_GET['error']) ?></p>
+<?php endif; ?>
+<p><a href="index.html">Import another file</a></p>
+</body>
+</html>
9 www/index.html
@@ -0,0 +1,9 @@
+<html>
+<body>
+<h1>Import Data File</h1>
+<form enctype="multipart/form-data" action="import.php" method="POST">
+<input name="import" type="file" /><br />
+<input type="submit" value="Import" />
+</form>
+</body>
+</html>
