Releases · archivesunleashed/aut

java.lang.RuntimeException: Unsupported literal type class scala.collection.immutable.Set$Set1 Set(liberal.ca) #529
Improve CommandLineApp.scala test coverage #262
Improve ExtractBoilerpipeText.scala test coverage #261
Improve ArchiveRecord.scala test coverage #260
Unit testing for RecordLoader #182
Improve ArchiveRecordWritable.java test coverage #76
Improve WarcRecordUtils.java test coverage #74
Improve ArcRecordUtils.java test coverage #73
Improve ExtractDate.scala test coverage #64
Remove org.apache.commons.httpclient #23

Merged pull requests:

Make webpages() consistent across aut and ARCH. #539 (ruebot)
Update README #537 (ruebot)
Fix codecov GitHub action. #536 (ruebot)
Bump commons-compress from 1.14 to 1.21 #535 (dependabot[bot])
Remove Java w/arc processing, and replace it with Sparkling. #533 (ruebot)
Bump xercesImpl from 2.12.0 to 2.12.2 #527 (dependabot[bot])

21 Jan 15:03

ruebot

aut-0.91.0

2d03904

Implemented enhancements:

Include timestamp in crawl date #525

Merged pull requests:

Change crawl_date format to YYYYMMDDHHMMSS, update hasDate filter. #526 (ruebot)
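For downstream code that reads `crawl_date` values, the 14-digit `YYYYMMDDHHMMSS` layout named in #526 can be parsed with the standard library. This is a minimal sketch, not part of aut: the helper names `parse_crawl_date` and `crawl_day` are hypothetical, and it assumes the value arrives as a plain string of exactly 14 digits.

```python
from datetime import datetime

# Assumption: crawl_date is a 14-digit string laid out as YYYYMMDDHHMMSS.
CRAWL_DATE_FORMAT = "%Y%m%d%H%M%S"


def parse_crawl_date(value: str) -> datetime:
    """Parse a YYYYMMDDHHMMSS string into a naive datetime (hypothetical helper)."""
    # strptime tolerates some short inputs, so check the length explicitly.
    if len(value) != 14 or not value.isdigit():
        raise ValueError(f"expected 14 digits, got {value!r}")
    return datetime.strptime(value, CRAWL_DATE_FORMAT)


def crawl_day(value: str) -> str:
    """Return only the day-level YYYYMMDD part of a crawl_date string."""
    return parse_crawl_date(value).strftime("%Y%m%d")
```

Validating the length up front keeps truncated or date-only values from being silently accepted; callers that still receive 8-digit dates would need to handle them separately.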

01 Nov 16:36

ruebot

aut-0.90.4

145354c

Implemented enhancements:

Replace scala-uri library from ExtractDomain and just parse public_suffix_list.dat #521

Fixed bugs:

Scaladocs haven't been created since 0.90.0 release #522

Merged pull requests:

Replace scala-uri library from ExtractDomain. #524 (ruebot)
Issue 522 #523 (ruebot)

22 Oct 14:09

ruebot

aut-0.90.3

2df52a5

Fixed bugs:

ExtractDomains returns non-Apex Domains #519

Merged pull requests:

Update ExtractDomain to extract apex domains. #520 (ruebot)
Bump jsoup from 1.13.1 to 1.14.2 #518 (dependabot[bot])

12 May 16:09

ruebot

aut-0.90.2

2af038d

Fixed bugs:

ARC file name appearing in url list #516
WARC-Target-URI in Wget warc files is not parsed properly #514

Merged pull requests:

Filter out filedesc and dns records from arcs. #517 (ruebot)
Handle wget WARC-Target-URI formatting. #515 (ruebot)

29 Apr 18:14

ruebot

aut-0.90.1

f185d91

Fixed bugs:

crawl_date is not included on binary information jobs when documentation says it is #512

Merged pull requests:

Add missing crawl_date column to binary information jobs. #513 (ruebot)
Update jsoup to 1.13.1 #511 (ruebot)

27 Jan 15:21

ruebot

aut-0.90.0

c0872c7

Fixed bugs:

Python implementation of .all() has .keepValidPages() incorrectly applied to it #502
Extract hyperlinks from wayback machine #501
Release 0.80.0 JAR produces error; built 0.80.1 fatjar built on repo works #495

Closed issues:

Migrate CI infrastructure from TravisCI to GitHub Action #506
Split tf into its own repo #498
Change master branch to main branch #490
GitHub action - Run isort and black on Python code #488
Add scalafmt GitHub action #486
Add Google Java Formatter as a GitHub action #484
Packages build is often broken - should we support it? #483
Implement SaveToDisk in Python #478
Java 11 support #356

Merged pull requests:

ars-cloud compatibility with aut and Java 11 #510 (ruebot)
Update to Spark 3.0.1 #508 (ruebot)
Replace TravisCI with GitHub Actions. #507 (ruebot)
Bump junit from 4.12 to 4.13.1 #505 (dependabot[bot])
Fix relative links extraction #504 (yxzhu16)
Remove .keepValidPages() on .all() Python implementation. #503 (ruebot)
Updates read.me to include citation section #500 (SamFritz)
Remove tf project; resolves #498. #499 (ruebot)
Add Python formatter GitHub Action. #489 (ruebot)
Add scalafmt GitHub action and apply it to scala code. #487 (ruebot)
Add Google Java Formatter as an action, and apply it. #485 (ruebot)
Add Python implementation of SaveBytes. #482 (ruebot)
Bump xercesImpl from 2.11.0 to 2.12.0 #481 (dependabot[bot])
[Skip Travis] Trim README down given aut.docs.archivesunleashed.org #480 (ruebot)
Spark 3.0.0 + Java 11 support. #375 (ruebot)

