Merge branch 'release/v0.1.0'
* release/v0.1.0:
  Add release workflow
  Add CI
  typo
MichaelSasser committed Jul 2, 2020
2 parents 0205523 + e3bab48 commit fa1b32e
Showing 3 changed files with 108 additions and 4 deletions.
42 changes: 42 additions & 0 deletions .github/workflows/ci.yml
@@ -0,0 +1,42 @@
# wordlist-dedup
# Copyright (c) 2020 Michael Sasser <Michael@MichaelSasser.org>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
#
---

name: Rust

on:
  push:
    branches: [ develop, master ]
  pull_request:
    branches: [ develop, master ]

env:
  CARGO_TERM_COLOR: always

jobs:
  build:

    name: Run CI workflow
    runs-on: ubuntu-latest

    steps:
      - name: Checkout code
        uses: actions/checkout@v2
      - name: Build
        run: cargo build --verbose
      - name: Run tests
        run: cargo test --verbose
62 changes: 62 additions & 0 deletions .github/workflows/release.yml
@@ -0,0 +1,62 @@
# wordlist-dedup
# Copyright (c) 2020 Michael Sasser <Michael@MichaelSasser.org>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

---
name: Upload Release Asset

on:
  push:
    tags:
      - "v*"

env:
  CARGO_TERM_COLOR: always

jobs:
  build:
    name: Upload Release Asset
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v2

      - name: Build Release version
        run: cargo build --release --verbose
      - name: Run tests
        run: cargo test --verbose

      - name: Zip project
        run: zip --junk-paths wordlist-dedup.zip target/release/wordlist-dedup README.md LICENSE.txt
      - name: Create Release
        id: create_release
        uses: actions/create-release@v1
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        with:
          tag_name: ${{ github.ref }}
          release_name: Release ${{ github.ref }}
          draft: false
          prerelease: false
      - name: Upload Release Asset
        id: upload-release-asset
        uses: actions/upload-release-asset@v1
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        with:
          upload_url: ${{ steps.create_release.outputs.upload_url }}
          asset_path: ./wordlist-dedup.zip
          asset_name: wordlist-dedup.zip
          asset_content_type: application/zip
8 changes: 4 additions & 4 deletions README.md
@@ -2,13 +2,13 @@

wordlist-dedup is a program written in rust to deduplicate wordlists. Duh.

-I tried to deduplicate lines of a hugh wordlist (>80 GB) with GNU/coreutils
+I tried to deduplicate lines of a huge wordlist (>80 GB) with GNU/coreutils
`uniq`. First everything seemed to be hunky dory. Before I deleted the original
file I spotted the size of the deduplicated. It was about half of the original.
In the firsthand I suspected about 5 % duplicates duplicates.

To check this, I wrote a program to count the duplicates and Bingo! The
-original file had jist a smidgen over 3 % of duplicates.
+original file had just a smidgen over 3 % of duplicates.

Maybe I did something wrong or my PC was not able to handle the memory
consumption of uniq. I don't know, why it needs that much memory and is so
@@ -25,7 +25,7 @@ wordlist-dedup.

wordlist-dedup as a pure commandline tool. Keep in mind, the file must be
sorted before running it. You can use GNU/coreutils `sort`, which does a fine
-job, even, when the RAM is limited. This means, the file cann be larger then
+job, even, when the RAM is limited. This means, the file can be larger then
the available RAM. wordlist-dedup does barely use any RAM.
You can use it to deduplicate a file like:

@@ -48,7 +48,7 @@ other scenarios.

Just run ``cargo build --release``.

-The binarywill be stored in ther "target" folder:
+The binary will be stored in the "target" folder:
`target/release/wordlist-dedup`.

## Semantic Versioning
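
Note on the README hunks in this commit: they state that the input must be sorted first and that wordlist-dedup barely uses any RAM. The project's source is not part of this diff, so the following is only a minimal sketch of how a single-pass deduplication over sorted input can stay at near-constant memory, not the author's implementation. The function name `dedup_sorted` is hypothetical, and the sketch assumes UTF-8 input with one word per line.

```rust
use std::io::{self, BufRead, Write};

/// Copies lines from `input` to `output`, skipping every line that is
/// identical to the line directly before it. Because the input is assumed
/// to be sorted, equal lines are always adjacent, so only one previous
/// line has to be kept in memory. Returns the number of dropped lines.
///
/// Hypothetical helper for illustration; assumes UTF-8 input.
fn dedup_sorted<R: BufRead, W: Write>(input: R, mut output: W) -> io::Result<u64> {
    let mut previous: Option<String> = None;
    let mut duplicates = 0u64;

    for line in input.lines() {
        let line = line?;
        // Compare against the last line that was written.
        if previous.as_deref() == Some(line.as_str()) {
            duplicates += 1;
            continue;
        }
        writeln!(output, "{}", line)?;
        previous = Some(line);
    }

    output.flush()?;
    Ok(duplicates)
}

fn main() -> io::Result<()> {
    // Reads sorted lines from stdin and writes the unique lines to stdout.
    let stdin = io::stdin();
    let stdout = io::stdout();
    let dropped = dedup_sorted(stdin.lock(), io::BufWriter::new(stdout.lock()))?;
    eprintln!("dropped {} duplicate lines", dropped);
    Ok(())
}
```

Memory use in this sketch is bounded by the longest line plus the I/O buffers, regardless of file size, which is consistent with the README's claim that the file can be larger than the available RAM once `sort` has done its work.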
