No description, website, or topics provided.
Branch: master
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
data-gen
public_html
sql
README.md

README.md

Drinker's Guide

See it live

About

This project was done for the undergraduate course Principles of Information and Data Management which covered SQL and relational database design.

Our project is aimed to make drinking in NJ bars safer. We keep track of sex offenders and use that information to determine which bars are the safest to visit and which sex offenders to watch out for. We calculate safety of a bar based on the number of sex offenders/offendees visit, how dangerous they are, and the number of illegal beers they sell. Users of our application can search based on their location in New Jersey to find the safest bars near them.

The data used in this project was generated by us (we gave credit to any tools we used in our code). We generated realistic data an implanted verifiable patterns into the data.

Getting Started

To first get started with data generation you can import one of the following into your SQL database:

  • sql/db_structure.sql this has only the tables of our database and not the actual data. Use this if you want to start from scratch.
  • sql/db_structure_and_data.sql this has both the tables and data

After this, you can modify/use the scripts we used to generate data in the data-gen/ directory.

You can also see the front-end in public_html/.

Notes

Breakdown of data

The data is for one month of October 2013 with the following assumptions:

  • ~5k drinkers
  • ~3k bars
  • ~250 sex offenders
  • Each drinker goes to bar on average 3-4 times
  • Each drinker frequents 2-3 different bars

Data Generation

Patterns

We need to add a couple of patterns to the data. It might be easier to add all of the data initially and then implement the patterns by deleting records which go against the pattern. This way we can just worry about making a shit ton of data for now.

People who leave with underage drinkers are sex offenders

This pattern states that if an older person left with an underage drinker the older person is a sex offender.

After running this is the SQL verification of the pattern:

Logic behind this: Drinkers who left with underage drinkers are sex offenders Find a drinker who did leave with an underage drinker but is not a sex offender I have to check both drinker1 and drinker2 in LeftWith

SELECT CASE WHEN (SELECT COUNT(*) FROM (
	SELECT d1.name
	FROM LeftWith lw, Drinker d1, Drinker d2 
	WHERE lw.drinker1 = d1.name AND d1.age >= 21 AND lw.drinker2 = d2.name and d2.age < 21 AND NOT EXISTS(SELECT * from SexOffender s WHERE s.name = d1.name)
	UNION
	SELECT d1.name
	FROM LeftWith lw, Drinker d1, Drinker d2 
	WHERE lw.drinker2 = d1.name AND d1.age >= 21 AND lw.drinker1 = d2.name and d2.age < 21 AND NOT EXISTS(SELECT * from SexOffender s WHERE s.name = d1.name)
	) A) = 0 THEN 'Yes'
ELSE 'No'
END AS isTrue

Bars which serve illegal beers have sex offenders

Here is the verification of the pattern:

SELECT CASE WHEN 
(SELECT COUNT(*) 
FROM Sells s 
WHERE s.beer IN (SELECT name FROM Beer where manf in (SELECT m.name FROM Manufacturer m,Country c Where m.country = c.name AND prohibition=1))
AND NOT EXISTS (SELECT * FROM SexOffender WHERE bar = s.bar)) = 0 THEN 'Yes'
ELSE 'No'
END AS isTrue

People who frequent international bars like at least one Polish beer

SELECT IF
((SELECT COUNT(f.drinker) FROM Frequents f, Bar b 
WHERE f.bar = b.name AND b.international='1' AND
(SELECT COUNT(*) FROM Likes l WHERE l.drinker = f.drinker AND l.beer IN(
SELECT b.name FROM Beer b, Manufacturer m
WHERE b.manf = m.name AND m.country = 'Poland')) = 0) = 0, 'Yes', 'No') AS verification

Safety Ranking

Ok, so this query gets the number of sex offenders all of the bars have:

SELECT b.name, COUNT(A.name) AS cnt 
FROM Bar b LEFT JOIN (SELECT f.bar AS bar, s.name AS name FROM SexOffender s, Frequents f WHERE s.name = f.drinker) A ON b.name = A.bar GROUP BY b.name ORDER By cnt;

Newer version (get illegal beers and number of sex offenders)

SELECT b.name, COUNT(A.name) AS offCnt, (SELECT COUNT(*)  FROM Sells s  WHERE s.bar = b.name AND s.beer IN (SELECT name FROM Beer where manf in (SELECT m.name FROM Manufacturer m,Country c Where m.country = c.name AND prohibition=1))) AS numIllegal 
FROM Bar b 
LEFT JOIN (SELECT c.bar AS bar, s.name AS name FROM SexOffender s, Consumed c WHERE s.name = c.drinker) A ON b.name = A.bar GROUP BY b.name ORDER BY numIllegal;

Newer version (using victims)

SELECT b.name, round((10 - (COUNT(A.name) + (SELECT COUNT(*)  FROM Sells s  WHERE s.bar = b.name AND s.beer IN (SELECT name FROM Beer where manf in (SELECT m.name FROM Manufacturer m,Country c Where m.country = c.name AND prohibition=1)))) * (10/13)),1) AS rating 
FROM Bar b LEFT JOIN (SELECT c.bar AS bar, s.name AS name FROM SexOffender s, Frequents c WHERE s.name = c.drinker OR s.victim = c.drinker) A ON b.name = A.bar 
GROUP BY b.name;

Yet another (actually this is slower)

SELECT off.name,  round(10 - (off.numOff + ill.numIllegal) * (10/13), 1) as rating 
FROM (SELECT b.name AS name, COUNT(A.name) AS numOff FROM Bar b 
	LEFT JOIN (SELECT c.bar AS bar, s.name AS name FROM SexOffender s, Frequents c WHERE s.name = c.drinker OR s.victim = c.drinker) A ON b.name = A.bar GROUP BY b.name) off 
LEFT JOIN
(SELECT s.bar as name, COUNT(illBeers.name) as numIllegal 
	FROM Sells s 
	LEFT JOIN 
	(SELECT name FROM Beer WHERE manf in (SELECT m.name FROM Manufacturer m,Country c Where m.country = c.name AND prohibition=1)) illBeers 
	ON illBeers.name = s.beer GROUP BY s.bar) ill ON ill.name = off.name

Danger Radius for Sex Offenders

I think this get's the maximum radius for a sex offender:

SELECT so.name, MAX(SQRT(POW(d.latitude - b.latitude,2) + POW(d.longitude - b.longitude, 2))) 
FROM SexOffender so, Drinker d, Frequents f, Bar b 
WHERE so.name = f.drinker AND f.bar = b.name AND d.name=so.name GROUP BY so.name;

Some possibilities:

  • bars with sex offenders have underage drinkers [done 11/17]
  • bars with illegal beers have sex offenders [done 11/17]
  • the higher the alchohol content in the beers which a bar serves correlates to how many sex offenders frequent that bar
  • everybody who frequents an international bar likes Zywiec (we could add a trigger to make sure anyone added into Frequents with an international bar has a corresponding row in the Likes table)
  • bars which serve beers with higher alchohol content are less safe
  • generate a safety rating for a bar (possibly using the time too)
    • Our ranking can be on a 1-10 scale:
      • 8-10 the bar is safe. There is low risk of visiting this bar
      • 5-7 There is some risk.
      • 3-4 There is a moderate risk.
      • 1-2 There is high risk.

Potential things to do if we have time

  • add a price to sells in beers and maybe think of a pattern with the price
  • get bar picture
  • get bar review comments [bad and good] maybe write them ourselves
  • remove people who were offended more than 2 times
  • add indexes to tables