branch: master

Dec 11, 2013

  1. add note about dropping Dec 1 2013 results

    git-svn-id: http://httparchive.googlecode.com/svn/trunk@1259 fc7d47d3-c008-acd5-f51f-d19787b8a02f
    stevesoudersorg@gmail.com authored

Dec 09, 2013

  1. derive favicon from logo

    git-svn-id: http://httparchive.googlecode.com/svn/trunk@1258 fc7d47d3-c008-acd5-f51f-d19787b8a02f
    stevesoudersorg@gmail.com authored
  2. restrict use of query.php. increase caching from 1 to 7 days for most resources.
    
    git-svn-id: http://httparchive.googlecode.com/svn/trunk@1257 fc7d47d3-c008-acd5-f51f-d19787b8a02f
    stevesoudersorg@gmail.com authored

Sep 11, 2013

  1. change Strangeloop to Radware

    git-svn-id: http://httparchive.googlecode.com/svn/trunk@1256 fc7d47d3-c008-acd5-f51f-d19787b8a02f
    stevesoudersorg@gmail.com authored
  2. change Strangeloop to Radware

    git-svn-id: http://httparchive.googlecode.com/svn/trunk@1255 fc7d47d3-c008-acd5-f51f-d19787b8a02f
    stevesoudersorg@gmail.com authored

Aug 26, 2013

  1. bail when there are no pages - this came up when doing a custom crawl that had no pages in the Top100.
    
    git-svn-id: http://httparchive.googlecode.com/svn/trunk@1254 fc7d47d3-c008-acd5-f51f-d19787b8a02f
    stevesoudersorg@gmail.com authored
  2. temporarily save dumps in BOTH formats (a separate requests table with the naming convention "requests_crawlid").
    
    git-svn-id: http://httparchive.googlecode.com/svn/trunk@1253 fc7d47d3-c008-acd5-f51f-d19787b8a02f
    stevesoudersorg@gmail.com authored
  3. If the list of URLs is specified in a file then do NOT import the URLs files from Alexa.
    
    git-svn-id: http://httparchive.googlecode.com/svn/trunk@1252 fc7d47d3-c008-acd5-f51f-d19787b8a02f
    stevesoudersorg@gmail.com authored
  4. Not permanent but an initial bit of code to extract results for _custom_rules.
    
    git-svn-id: http://httparchive.googlecode.com/svn/trunk@1251 fc7d47d3-c008-acd5-f51f-d19787b8a02f
    stevesoudersorg@gmail.com authored

Aug 15, 2013

  1. set new urlhash field when creating pages during crawl

    git-svn-id: http://httparchive.googlecode.com/svn/trunk@1250 fc7d47d3-c008-acd5-f51f-d19787b8a02f
    stevesoudersorg@gmail.com authored
  2. move $label definition higher

    git-svn-id: http://httparchive.googlecode.com/svn/trunk@1249 fc7d47d3-c008-acd5-f51f-d19787b8a02f
    stevesoudersorg@gmail.com authored

Aug 09, 2013

  1. in flux during DB schema transition

    git-svn-id: http://httparchive.googlecode.com/svn/trunk@1248 fc7d47d3-c008-acd5-f51f-d19787b8a02f
    stevesoudersorg@gmail.com authored
  2. save "crawlid" to the pages table

    git-svn-id: http://httparchive.googlecode.com/svn/trunk@1247 fc7d47d3-c008-acd5-f51f-d19787b8a02f
    stevesoudersorg@gmail.com authored
  3. Save "crawlid" to the status table. (Later we should remove "label".)

    git-svn-id: http://httparchive.googlecode.com/svn/trunk@1246 fc7d47d3-c008-acd5-f51f-d19787b8a02f
    stevesoudersorg@gmail.com authored
  4. add function updateCrawlFromId()

    git-svn-id: http://httparchive.googlecode.com/svn/trunk@1245 fc7d47d3-c008-acd5-f51f-d19787b8a02f
    stevesoudersorg@gmail.com authored
  5. save "crawlid" to the status table

    git-svn-id: http://httparchive.googlecode.com/svn/trunk@1244 fc7d47d3-c008-acd5-f51f-d19787b8a02f
    stevesoudersorg@gmail.com authored

Aug 07, 2013

  1. set the new timeAdded field when importing URLs

    git-svn-id: http://httparchive.googlecode.com/svn/trunk@1243 fc7d47d3-c008-acd5-f51f-d19787b8a02f
    stevesoudersorg@gmail.com authored
  2. update new timeAdded field when we add a URL

    git-svn-id: http://httparchive.googlecode.com/svn/trunk@1242 fc7d47d3-c008-acd5-f51f-d19787b8a02f
    stevesoudersorg@gmail.com authored
  3. add log message

    git-svn-id: http://httparchive.googlecode.com/svn/trunk@1241 fc7d47d3-c008-acd5-f51f-d19787b8a02f
    stevesoudersorg@gmail.com authored
  4. add a smaller getUrlhashFunc function

    git-svn-id: http://httparchive.googlecode.com/svn/trunk@1240 fc7d47d3-c008-acd5-f51f-d19787b8a02f
    stevesoudersorg@gmail.com authored
  5. When adding URLs to the "urls" table, make sure to set the urlhash.

    git-svn-id: http://httparchive.googlecode.com/svn/trunk@1239 fc7d47d3-c008-acd5-f51f-d19787b8a02f
    stevesoudersorg@gmail.com authored

Jul 21, 2013

  1. better error handling for no results

    git-svn-id: http://httparchive.googlecode.com/svn/trunk@1238 fc7d47d3-c008-acd5-f51f-d19787b8a02f
    stevesoudersorg@gmail.com authored
  2. better error handling when the requested URL string is not found.

    git-svn-id: http://httparchive.googlecode.com/svn/trunk@1237 fc7d47d3-c008-acd5-f51f-d19787b8a02f
    stevesoudersorg@gmail.com authored
  3. use crawlid for diffRuns(). Enable flush.

    git-svn-id: http://httparchive.googlecode.com/svn/trunk@1236 fc7d47d3-c008-acd5-f51f-d19787b8a02f
    stevesoudersorg@gmail.com authored
  4. fix bug where we have to decrease the epoch time for latestCrawl().

    git-svn-id: http://httparchive.googlecode.com/svn/trunk@1235 fc7d47d3-c008-acd5-f51f-d19787b8a02f
    stevesoudersorg@gmail.com authored
  5. Make the list of URLs non-wrap.

    git-svn-id: http://httparchive.googlecode.com/svn/trunk@1234 fc7d47d3-c008-acd5-f51f-d19787b8a02f
    stevesoudersorg@gmail.com authored
  6. add getPage() based on crawlid

    git-svn-id: http://httparchive.googlecode.com/svn/trunk@1233 fc7d47d3-c008-acd5-f51f-d19787b8a02f
    stevesoudersorg@gmail.com authored
  7. Improve archiveLabelsForUrl to use urlhash so MUCH faster and more accurate (bugs 358 & 359). Make diffRuns use crawlid - much faster (runs.php does infinite scroll almost instantaneously now).
    
    git-svn-id: http://httparchive.googlecode.com/svn/trunk@1232 fc7d47d3-c008-acd5-f51f-d19787b8a02f
    stevesoudersorg@gmail.com authored
  8. add getCrawlFromId(). Add $bCrawlid param to archiveLabels() to return array of crawlids (instead of labels). Add $bMysql and $bCsv to dumpCrawl2() to specify just one desired format (to save time).
    
    git-svn-id: http://httparchive.googlecode.com/svn/trunk@1231 fc7d47d3-c008-acd5-f51f-d19787b8a02f
    stevesoudersorg@gmail.com authored

Jul 19, 2013

  1. If no label is specified, return pageData for the most RECENT test for a URL.
    
    git-svn-id: http://httparchive.googlecode.com/svn/trunk@1230 fc7d47d3-c008-acd5-f51f-d19787b8a02f
    stevesoudersorg@gmail.com authored
  2. The previous change was MUCH faster, but it pulled in URLs that did not have any actual results. (Because we saved all 1M top urls, but we only crawl the top 300K.) This version leverages a faster urlhash search to only pull URLs that have results.
    
    git-svn-id: http://httparchive.googlecode.com/svn/trunk@1229 fc7d47d3-c008-acd5-f51f-d19787b8a02f
    stevesoudersorg@gmail.com authored
  3. make searching for a URL faster - using the "urls" table

    git-svn-id: http://httparchive.googlecode.com/svn/trunk@1228 fc7d47d3-c008-acd5-f51f-d19787b8a02f
    stevesoudersorg@gmail.com authored

Jul 18, 2013

  1. Remove the deprecated autocomplete URL picker.

    git-svn-id: http://httparchive.googlecode.com/svn/trunk@1227 fc7d47d3-c008-acd5-f51f-d19787b8a02f
    stevesoudersorg@gmail.com authored
  2. Add some comments. Use getUrlhashCond().

    git-svn-id: http://httparchive.googlecode.com/svn/trunk@1226 fc7d47d3-c008-acd5-f51f-d19787b8a02f
    stevesoudersorg@gmail.com authored

Jun 24, 2013

  1. add note about changing mobile network to 3G

    git-svn-id: http://httparchive.googlecode.com/svn/trunk@1225 fc7d47d3-c008-acd5-f51f-d19787b8a02f
    stevesoudersorg@gmail.com authored