Added dependencies
hexylena committed Apr 25, 2012
1 parent f490a58 commit f2e87fe
Showing 1 changed file with 5 additions and 0 deletions.
5 changes: 5 additions & 0 deletions README.md
@@ -19,6 +19,10 @@ All the scripts are intended to be run from the command line. Aside from enablin

The web stats site Alexa publishes a list of the top one million domains. Download it and convert the CSV to a file with one domain per line with this script.
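
As a rough sketch of that conversion, assuming the downloaded list is a plain `rank,domain` CSV (for example `1,google.com`) with no header or quoting: the `csv_to_domains` helper and the file names below are illustrative only and are not part of this repository.

```shell
# Hypothetical helper, not one of the scripts in this repository.
# Assumes each input line looks like "rank,domain" with no quoting or header.
csv_to_domains() {
    # Keep only the second comma-separated field (the domain).
    cut -d, -f2 "$1"
}

# Example usage (file names are assumptions):
#   csv_to_domains top-1m.csv > domains.txt
```

The output has one domain per line, which is the shape the download step expects according to the paragraph above.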

### Requirements

$ sudo apt-get install php5-curl php5-tidy

### Download a million files

$ php download.php
@@ -35,6 +39,7 @@ It could trivially be made into a more serious tool by reading a URL list from s

I found only one other algorithm to do it, and it wasn't very good. I just use a few simple heuristics, with a blacklist being the most important. I want to make as few assumptions as possible about what constitutes ASCII art. Hence, I don't try to look for more examples of what I have already seen, but just filter out what I can be sure is *not* art. This means a more sophisticated approach like a Bayesian filter is not useful in this context.


Your code looks like crap
-------------------------
