
@boogheta boogheta released this Nov 18, 2019 · 44 commits to master since this release

Changelog:

  • Settings can now be adjusted when creating a new corpus (#229)
  • Network pages improvements (search nodes #255, view links direction #286, colors #311)
  • Admin page improvements (filter, order, backup and reindex actions, destroy all button... #264)
  • Links from OUT and DISCOVERED entities are no longer taken into account when computing WEs' indegree (#232)
  • First version for displaying each WE's ego network (#316 #204)
  • Export buttons for crawls metadata in All crawls page (#319)
  • Fix starting crawls with a very large number of prefixes and startpages, which was previously impossible (#353)
  • Use imported urls as startpages when importing a preexisting webentity (#365)
  • Handle the case of nested WebEntities for crawls (#326)
  • Updated list of redirection domains to follow when crawling (#346)
  • Fix missing www for creationrules with a path prefix (#363)
  • Minor frontend improvements (login #239 #279, prospect #323, webentity edit #314 #304, pages network #335, startpages #324, homepage #340, crawls #297, ...)
  • Add API routes to collect crawled pages' metadata & html content when the option is activated
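
The indegree change (#232) can be illustrated with a minimal sketch. It is not Hyphe's actual code: the function name, the `statuses` mapping and the `(source, target)` link pairs are assumptions made for illustration only. The single rule taken from the changelog is that links whose source entity is OUT or DISCOVERED are skipped.

```python
from collections import Counter

# Statuses whose outgoing links are ignored, following the changelog entry.
EXCLUDED_SOURCE_STATUSES = {"OUT", "DISCOVERED"}


def compute_indegrees(statuses, links):
    """Count incoming links per web entity, skipping excluded sources.

    statuses: dict mapping a (hypothetical) entity id to its status string
    links: iterable of (source_id, target_id) pairs
    """
    # Start every known entity at zero so unlinked entities still appear.
    indegrees = Counter({entity_id: 0 for entity_id in statuses})
    # Deduplicate so that each source/target pair counts only once.
    for source, target in set(links):
        if statuses.get(source) in EXCLUDED_SOURCE_STATUSES:
            continue
        indegrees[target] += 1
    return dict(indegrees)
```

With entities `a` (IN), `b` (OUT), `c` (DISCOVERED) and `d` (UNDECIDED), only the links coming from `a` and `d` contribute to the counts.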

Many thanks to @2LaMa who's behind a lot of these improvements!


@boogheta boogheta released this Sep 13, 2019 · 220 commits to master since this release

Changelog:

  • Fix "homepage" mode for automatic startpages (breaking crawls from prospect on some settings)
  • Fix some breaking calls to get_tags with no namespace
  • Fix action menu in List WebEntities sometimes not triggered
  • Better handle errors coming from empty calls (closes #337)

@boogheta boogheta released this Aug 6, 2019 · 228 commits to master since this release

Changelog:

  • Use traph 1.2.0 with paginated queries to fix issues when collecting all pages and pagelinks of a single webentity at once (#293); also speed up collecting child entities and cache the number of pages per entity during network computation
  • Fix broken WebEntity pages network view
  • Add number of pages per webentity to WebEntities Lists, as well as exports and network view
  • Fix creationrules missing after resetting a corpus (#320)
  • Fix password protected access to corpora
  • Always include the homepage as a startpage when crawling a DISCOVERED webentity (#322)
  • Fix various crawler errors
  • Allow editing a tag in a single API call instead of removing then adding
  • Add script to trigger backup for all existing corpora
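
The paginated queries mentioned above follow a common pattern: instead of loading every page of a webentity in one call, results are fetched chunk by chunk until no continuation token remains. The sketch below shows the generic pattern only. `fetch_page`, its token convention and `make_fake_source` are hypothetical names, not the traph or Hyphe API.

```python
def iterate_paginated(fetch_page):
    """Yield every item from a paginated source.

    fetch_page(token) is a hypothetical callable returning (items, next_token),
    where next_token is None once the last chunk has been served.
    """
    token = None
    while True:
        items, token = fetch_page(token)
        yield from items
        if token is None:
            break


def make_fake_source(data, page_size):
    """Build an in-memory fetch_page over a list, for demonstration only."""
    def fetch_page(token):
        start = token or 0  # None means "start from the beginning"
        chunk = data[start:start + page_size]
        next_start = start + page_size
        return chunk, (next_start if next_start < len(data) else None)
    return fetch_page


if __name__ == "__main__":
    source = make_fake_source(["page-%d" % i for i in range(7)], page_size=3)
    for page in iterate_paginated(source):
        print(page)
```

Because the generator holds a single chunk at a time, memory use stays bounded by the page size rather than by the total number of pages of the webentity.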

@boogheta boogheta released this Jun 6, 2019 · 271 commits to master since this release

Changelog:

  • Upgraded Scrapy (0.24.6 -> 1.6) and ScrapyD (1.0.1 -> 1.2.0) versions to latest ones, fixing broken crawls on many https websites (#268 #273 #312 #270) and broken Docker installs on some Windows and Mac machines
  • Upgraded Hyphe-Traph (1.0.0 -> 1.1.0) for faster automatic homepage identification
  • Upgraded Graphology (0.11.4 -> 0.14.1) & Sigma (2.0.0-alpha18 -> 2.0.0-alpha20) for small networks fixes
  • Improved Tags Inputs in Frontend's "WebEntity edition" and "Manage Tags" pages
  • Transformed FREETAGS into actual research "Field notes" preparing HyBro's coming new direction (#296)
  • Plenty of minor backend & frontend fixes (#305 #291 #310 #302 #281 #276 #248 #244 #275 #236 #294 #290 #258 ...)

@boogheta boogheta released this Jan 18, 2019 · 323 commits to master since this release

Changelog:

  • Fix docker issue with NFS volumes and alpine dependencies
  • Give the crawler more recent user-agents
  • Use more recent version of sigma.js in frontend's graph visualisation (#285)
  • Add sorting buttons in frontend's crawls list
  • Minor frontend fixes (#280 #277)

Pre-release v1.0.4 (7c2e2e7)

@boogheta boogheta released this Jan 18, 2019 · 324 commits to master since this release

Warning: please use version 1.0.5 instead

Changelog:

  • Fix docker issue with NFS volumes
  • Give the crawler more recent user-agents
  • Use more recent version of sigma.js in frontend's graph visualisation (#285)
  • Add sorting buttons in frontend's crawls list
  • Minor frontend fixes (#280 #277)

@boogheta boogheta released this Aug 23, 2018 · 350 commits to master since this release

Changelog:

  • Small frontend bugfixes (#259, #262 + tags autocomplete/sorting issues)
  • Add option to set up cookies for some crawls via advanced API use
  • Prioritize indexing over webentity links computation when the queue gets too long

@boogheta boogheta released this Apr 26, 2018 · 366 commits to master since this release

Changelog:

  • Updated mongodb calls and added more indexes to speed up page indexing
  • Changed the default configuration from storing html contents to not storing them, to reduce disk usage
  • Small frontend bugfixes (#252, #254, #261)
  • Fixed bin/clone_corpus script

@boogheta boogheta released this Jan 26, 2018 · 395 commits to master since this release

Changelog:

  • Added Docker frontend config options to restrict Hyphe access behind htpasswd (#249)
  • Added Docker backend config options to configure DefaultStartpagesMode, CreationRules and RedirectionDomains (#242)
  • Fixed compatibility of exported GEXF graph with Gephi (removed extra id attributes)

@boogheta boogheta released this Jan 15, 2018 · 405 commits to master since this release

After many development versions over the past few years, Hyphe finally reaches a stable and faster release with version 1.0.0, which includes:

  • an easy installation process relying on Docker for any kind of machine including Linux, Mac OS X & Windows
  • a new homemade memory structure named hyphe-traph, relying on our mix of Tree and Graph structures
  • a Material-based redesigned web interface with embedded tagging and a couple of other new functionalities