Commits on Jul 11, 2016
  1. Remove old partial "story_sentences_dup" index and its maintenance script

    There's now a table-wide "story_sentences_sentence_half_md5" index to replace it.
    Former-commit-id: e5caffddfd3652957fc25ed5f4eb299c33c3ef08
    committed Jul 11, 2016
  2. Merge branch 'master' into story_sentences_fix_deadlock

    Former-commit-id: c9171f6c412c0b13bb40ecaeeac750ff6c6d4367
    committed Jul 11, 2016
  3. Sleep less between chunks because fetchers are pretty fast

    Former-commit-id: 716d270dde7ada0c98689db57c32b58e90907e91
    committed Jul 11, 2016
  4. Remove table "stories_from_failed_bitly_rabbitmq_queue" and accompanying script

    Former-commit-id: b89795045dc3333d9dd100c37e80f8007e01e0da
    committed Jul 11, 2016
  5. Merge branch 'bitly_process_schedule_supervisor'

    Former-commit-id: f5a0566ccd0e9f764d235a8c5f37b51630635034
    committed Jul 11, 2016
  6. Remove unused subroutine process_due_schedule_until_finished()

    Former-commit-id: c3722f57769990859f0256885b5654b7336257f4
    committed Jul 11, 2016
  7. Remove unused "use"

    Former-commit-id: e98b23351541667cf693baef774f936f3af7bbb3
    committed Jul 11, 2016
  8. Redo "" into polling script with delays

    The current job broker (RabbitMQ) causes problems when its queue is too big. Also, when the queue gets too big (contains months' worth of stories to fetch stats for), the schedule processing script starts duplicating stories in the queue, as it assumes that stats must have been fetched at day 3, so now they have to be refetched for day 30.
    So, add fetch jobs in chunks of 1000 each and wait for a minute after each chunk, filling the queue just a little bit faster than the fetchers are able to process it.
    Former-commit-id: 019a01e3f06d1a48c6c830191380f039729af2c0
    committed Jul 11, 2016
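The chunk-and-wait approach described in this commit can be sketched roughly as follows. This is a minimal Python illustration, not the project's code: `enqueue_in_chunks`, `add_job`, and the injectable `sleep` parameter are hypothetical names. Only the chunk size of 1000 and the one-minute delay come from the commit message.

```python
import time
from typing import Callable, Iterable, List

CHUNK_SIZE = 1000    # stories per chunk (from the commit message)
DELAY_SECONDS = 60   # wait after each chunk (from the commit message)


def enqueue_in_chunks(
    story_ids: Iterable[int],
    add_job: Callable[[int], None],          # hypothetical: puts one fetch job on the queue
    chunk_size: int = CHUNK_SIZE,
    delay: float = DELAY_SECONDS,
    sleep: Callable[[float], None] = time.sleep,  # injectable so the loop can be tested
) -> int:
    """Add one fetch job per story, pausing after every full chunk.

    Returns the number of chunks that were enqueued.
    """
    chunk: List[int] = []
    chunks_done = 0
    for story_id in story_ids:
        chunk.append(story_id)
        if len(chunk) == chunk_size:
            for sid in chunk:
                add_job(sid)
            chunks_done += 1
            chunk = []
            sleep(delay)  # give the fetchers time to drain the queue
    # Enqueue the final, partial chunk without waiting afterwards.
    if chunk:
        for sid in chunk:
            add_job(sid)
        chunks_done += 1
    return chunks_done
```

Passing `sleep` as a parameter is only a convenience for testing; a real script would most likely call the default and tune the delay to how fast the fetchers actually are.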
  9. Split schedule processing into subs that do a chunk and all st…

    process_due_schedule_chunk() adds a chunk of stories to job broker's queue; process_due_schedule_until_finished() adds jobs to the queue until there are no more jobs to add.
    Former-commit-id: ca2c6a1590f5834d100dff423d38537ae0713a40
    committed Jul 11, 2016
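The split described in this commit (one subroutine that adds a single chunk, another that repeats it until nothing is left) might look roughly like the sketch below. The function names mirror the commit message, but the signatures, the in-memory `due` deque, and the `add_job` callback are assumptions made for illustration; the original subroutines presumably read due stories from the database.

```python
from collections import deque
from typing import Callable, Deque


def process_due_schedule_chunk(
    due: Deque[int],                      # hypothetical stand-in for the due schedule
    add_job: Callable[[int], None],       # hypothetical: adds one job to the broker's queue
    chunk_size: int = 1000,
) -> int:
    """Move up to chunk_size due stories into the job queue; return how many were moved."""
    moved = 0
    while due and moved < chunk_size:
        add_job(due.popleft())
        moved += 1
    return moved


def process_due_schedule_until_finished(
    due: Deque[int],
    add_job: Callable[[int], None],
    chunk_size: int = 1000,
) -> int:
    """Call the chunk subroutine repeatedly until it has nothing left to add."""
    total = 0
    while True:
        moved = process_due_schedule_chunk(due, add_job, chunk_size)
        if moved == 0:
            return total
        total += moved
```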
  10. Add comments about what the subroutines do

    [ci skip]
    Former-commit-id: 0a220bdd62c301362a07e14889a4f6e42b719b60
    committed Jul 11, 2016
  11. Remove "mediawords_" prefix

    Former-commit-id: 3167ef61ce057696484f83592e3bc5c0813d08d1
    committed Jul 11, 2016
Commits on Jul 10, 2016
  1. change frame to fucs

    Former-commit-id: 8f799012fb908495564598c5feae032aae77cfe3
    hroberts committed Jul 10, 2016
Commits on Jul 8, 2016
  1. init static variables for each run

    Former-commit-id: 1ce44771fcba1bdabe22a1beb28c1cd433ec8d83
    hroberts committed Jul 8, 2016
  2. Add 10k stories to reextraction queue

    Otherwise it starves out sometimes.
    Former-commit-id: 37f750e0224a3cf63f8bdf163a900bb602df0616
    committed Jul 8, 2016
  3. fix merge conflict; move sum_media_inlink_count to medium_link_counts

    Former-commit-id: 23b26a300b24f77a48f39d4432a534fc234755c2
    hroberts committed Jul 8, 2016
  4. Merge branch 'media_inlinks'

    Former-commit-id: 4352041f534d946bfe2878e17ab27d65384dd26d
    hroberts committed Jul 8, 2016
  5. merge sql migration with master

    Former-commit-id: 4a86a14a778cd2a1c59566982142de6903169ea5
    hroberts committed Jul 8, 2016
  6. merge sql migration

    Former-commit-id: 79d16d60a710faac33e43190f99a5bc56d9b75e4
    hroberts committed Jul 8, 2016
  7. Add stories to reextraction queue in random order

    This way it doesn't have to wait for an advisory lock on media_id.
    Former-commit-id: c55448bb06641dfd6ab93b22aa7cacc2ad3500b6
    committed Jul 8, 2016
  8. Replace EOF with SQL

    Sublime Text 2 then does syntax highlighting on SQL queries.
    Former-commit-id: 702c7ebc11ab0d12504318c6844b5c0297553fc8
    committed Jul 8, 2016
Commits on Jul 7, 2016
  1. Wrap deadlocking sentence deduplication query into advisory locks

    Just to see what will happen.
    Former-commit-id: 1183549cc7c56ea30c101f418bae1a53bbb7a558
    committed Jul 7, 2016
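For readers unfamiliar with the technique, wrapping a query in a PostgreSQL advisory lock can be sketched as below. This is an illustration under assumptions, not the committed change: `run_with_advisory_lock` is a hypothetical helper, the cursor is any DB-API cursor using `%s` placeholders (as psycopg2 does), and the choice of a transaction-level lock via `pg_advisory_xact_lock()` is a guess, since the commit does not say which lock function or lock key was used.

```python
from typing import Any, Sequence


def run_with_advisory_lock(
    cursor: Any,                 # DB-API cursor inside an open transaction (assumed)
    lock_key: int,               # e.g. a media_id; the actual key is an assumption
    query: str,
    params: Sequence[Any] = (),
) -> None:
    """Take a transaction-level advisory lock, then run the query.

    pg_advisory_xact_lock() blocks until the lock is free and is released
    automatically when the surrounding transaction commits or rolls back.
    """
    cursor.execute("SELECT pg_advisory_xact_lock(%s)", (lock_key,))
    cursor.execute(query, tuple(params))
```

Because every caller that uses the same key waits at the lock instead of inside the query, competing transactions are serialized before they can take row locks in conflicting order.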
  2. Print story sentence insertion query for debugging purposes

    Former-commit-id: 562139ece6cb5d1ed5ff8f13a5e51a0740c84adc
    committed Jul 7, 2016
  3. Fix subroutine name

    Former-commit-id: a9d909194b68ea8fe7b4a5a515a8110af08b271f
    committed Jul 7, 2016
  4. Merge branch 'recover_failed_bitly_rabbitmq_queue'

    Former-commit-id: c5e94e66ebddcfa365036783b3c4e6a9212a7aaf
    committed Jul 7, 2016
  5. Add "_" suffix to private subroutines, move them to top

    Former-commit-id: e41f2a13309394d57bd2a37098ea5868de11101c
    committed Jul 7, 2016
  6. Add script that reschedules stories for processing from failed queue

    Former-commit-id: 2179dc683e7a53cb26815dd6c03eae775a1429d0
    committed Jul 7, 2016
  7. add media_inlink metrics

    Former-commit-id: 417851b551898f4ca5fa3d0dce4041ae3ff2c8b3
    hroberts committed Jul 7, 2016
  8. Add table for storing failed RabbitMQ queue

    Former-commit-id: 6048a91bf2fa5fa53243252b81b4daff6756d86d
    committed Jul 7, 2016
  9. Revert "Don't extract / CoreNLP-annotate / old reextracted stories"

    This reverts commit e4c02fbb1b6c6b7dc5cead46193b601f5fa23ca1 [formerly acc9362].
    Former-commit-id: c16dd80ce51d83eae47d0b6fbed2260690e03872
    committed Jul 7, 2016
  10. Merge branch 'story_sentences_fix_deadlock'

    Former-commit-id: e984941a0a2c8b9a008aa4d139da6bbce1cce7dc
    committed Jul 7, 2016
  11. Try not setting timezone at all

    Former-commit-id: db41eb1cd80b0ad266831a2c4cbdc963cf4520f3
    committed Jul 7, 2016
  12. Apply PostgreSQL configuration to all config sets found under /etc/po…

    Former-commit-id: ae1459ac054eb08c7650323dcee7802c50ea0cd3
    committed Jul 7, 2016
  13. Add comment about what each configuration file does

    Former-commit-id: 455a060c51e762dfaf83729952416067a149a454
    committed Jul 7, 2016
  14. Fix quotes

    Former-commit-id: 68940ec54b465556995d5b9b0e7db0b84763deec
    committed Jul 7, 2016
  15. Fail if PostgreSQL is not installed

    Former-commit-id: 0833dacbd8dd6a586f4c88a8effd0c8ef4c78615
    committed Jul 7, 2016