We build a model that predicts whether a domain hosts pornographic content, using a short list of keywords and a list of domain suffixes. To build the model, we use data from Shallalist, which maintains a database of the categories of content hosted by each domain. The method is described in "Where's the Porn? Classifying Porn Domains Using a Calibrated Keyword Classifier."
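As a rough illustration of the idea, the sketch below turns a domain name into 0/1 indicators for keyword matches and for the domain suffix. The keyword and suffix lists, the feature names, and the `domain_features` helper are all hypothetical and chosen only for this example; the features and model used in the paper may differ.

```python
# Hypothetical keyword and suffix lists, for illustration only;
# they are not the lists used to build the actual model.
KEYWORDS = ["porn", "xxx", "sex"]
SUFFIXES = ["com", "net", "xxx"]


def domain_features(domain, keywords=KEYWORDS, suffixes=SUFFIXES):
    """Return a dict of 0/1 features for a single domain name."""
    name = domain.lower().strip().rstrip(".")
    # Split the name into the part before the last dot and the suffix after it.
    if "." in name:
        body, suffix = name.rsplit(".", 1)
    else:
        body, suffix = name, ""
    # One indicator per keyword: does it appear as a substring of the body?
    features = {f"kw_{k}": int(k in body) for k in keywords}
    # One indicator per suffix: does the domain end in it?
    features.update({f"tld_{s}": int(suffix == s) for s in suffixes})
    return features


if __name__ == "__main__":
    for d in ["FreePornVideos.com", "essex.net", "example.org"]:
        print(d, domain_features(d))
```

Substring matching also flags benign names such as `essex.net`, which is one reason to fit a model to labeled data such as Shallalist's rather than treat any keyword hit as a positive.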
Using the following Shallalist data, list of keywords, and list of domain suffixes, the classifier achieves an accuracy of nearly 80%.