DandyHacks2018

Steven Li, Jeffrey Weng, Justin Yau, Jake Zaia

The Orange

Inspiration

The Onion and "fake news" inspired this project. Our project name, The Orange, is both a parody of "The Onion" and a reference to an old English phrase, "a clockwork orange". A clockwork orange appears normal and natural on the outside but is bizarre and mechanical on the inside. Similarly, our website appears to be news but is completely randomly generated by computers, using sources such as The Onion as training data.

What it Does

Using Markov chains, we train a model on large bodies of text. We then use the trained model to generate original news stories. Users can also vote and comment on the articles. We store all our information in MongoDB on a DigitalOcean droplet.
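To illustrate the idea, here is a minimal sketch of word-level Markov chain text generation. It is not the code in this repository: the function names `build_chain` and `generate`, the `order` parameter, and whitespace tokenization are all assumptions made for the example.

```python
import random
from collections import defaultdict


def build_chain(text, order=2):
    """Map each run of `order` consecutive words to the words seen right after it."""
    words = text.split()  # assumption: plain whitespace tokenization
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return dict(chain)


def generate(chain, length=30, seed=None):
    """Random-walk the chain and return at most `length` words as one string."""
    if not chain:
        return ""
    rng = random.Random(seed)
    # Sort the states so that a fixed seed always picks the same starting point.
    state = rng.choice(sorted(chain))
    output = list(state)
    while len(output) < length:
        followers = chain.get(state)
        if not followers:
            # Dead end: this state only occurred at the very end of the corpus.
            break
        output.append(rng.choice(followers))
        state = tuple(output[-len(state):])
    return " ".join(output[:length])


if __name__ == "__main__":
    corpus = "the cat sat on the mat and the cat ran off the mat"
    print(generate(build_chain(corpus, order=1), length=10))
```

A follower that appears more often after a given state is stored more often in that state's list, so `rng.choice` picks it with proportionally higher probability. A larger `order` tends to produce more coherent text but copies longer stretches of the training data verbatim.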

How We Built It

A scraper written in Python uses several APIs and libraries to pull relevant data from different news sources. We then feed that information into our Markov chain algorithm (also written in Python) to help it "learn" how to write natural-sounding articles. The website itself runs on a Python Flask backend with a light Bootstrap and jQuery frontend, and is deployed with Apache. The DigitalOcean droplet hosts the web server and also stores the data in MongoDB.
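As a rough illustration of the scraping step, the sketch below pulls paragraph text out of an HTML page using only Python's standard library. It is not the project's scraper, which relies on several APIs and libraries; the class name `ParagraphExtractor`, the helper `extract_article_text`, and the assumption that article text lives in `<p>` tags are all hypothetical.

```python
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collect the text found inside <p> elements and ignore everything else."""

    def __init__(self):
        super().__init__()
        self._in_paragraph = False
        self._buffer = []
        self.paragraphs = []

    def _flush(self):
        # Collapse runs of whitespace and keep only non-empty paragraphs.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.paragraphs.append(text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            # A new <p> also ends any paragraph that was left unclosed.
            self._flush()
            self._in_paragraph = True

    def handle_endtag(self, tag):
        if tag == "p":
            self._flush()
            self._in_paragraph = False

    def handle_data(self, data):
        if self._in_paragraph:
            self._buffer.append(data)


def extract_article_text(html):
    """Return the page's paragraph text, one paragraph per line."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.paragraphs)


if __name__ == "__main__":
    page = "<body><nav>Home</nav><p>Area man <b>wins</b> award.</p></body>"
    print(extract_article_text(page))
```

The returned text could then be passed to the Markov chain as training data. Real news sites differ in where they keep the article body, so a per-site rule would likely be needed in place of the single `<p>` heuristic used here.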

Challenges

The scraper was challenging. Different websites use different data storage formats, and writing a scraper that consistently extracts relevant data from all of them is difficult. Working with large data sets was also difficult, as our algorithm requires a decent amount of computational power and can produce very large files. Implementing Markov chains was also an arduous task, as none of the team had any formal experience doing so before.

What We Learned

We learned how to use Markov chains to produce realistic-sounding text. We also learned how to use scrapers to get information from the internet algorithmically. Additionally, we learned how to deploy a Flask app using Apache (apache2) on a Linux machine.

What's Next

There's still plenty to do, such as:

Improving our scrapers to get more meaningful data
Training our algorithms on larger datasets to make better stories
Automatically generating stories on a set schedule
Expanding our website to have more functionality for users

