Create DeletionBolt.java for Solr. #1050 #1073
storm-crawler-solr bug. Missing DeletionBolt bolt code. apache#1050
License header added
external/solr/src/main/java/com/digitalpebble/stormcrawler/solr/bolt/DeletionBolt.java
🛠 Lift Auto-fix
Some of the Lift findings in this PR can be automatically fixed. You can download and apply these changes in your local project directory of your branch to review the suggestions before committing.
# Download the patch
curl https://lift.sonatype.com/api/patch/github.com/DigitalPebble/storm-crawler/1073.diff -o lift-autofixes.diff
# Apply the patch with git
git apply lift-autofixes.diff
# Review the changes
git diff
Want it all in a single command? Open a terminal in your project's directory and copy and paste the following command:
curl https://lift.sonatype.com/api/patch/github.com/DigitalPebble/storm-crawler/1073.diff | git apply
Once you're satisfied, commit and push your changes in your project.
formatting
Could you please add the bolt to the SolrCrawlTopology so that people can see how to connect it to the other components?
import org.slf4j.LoggerFactory;

public class DeletionBolt extends BaseRichBolt {
    /** */
Remove the empty comment and the serialVersionUID field.
    private SolrConnection connection;

    public DeletionBolt() {
        /* empty */
remove comment
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class DeletionBolt extends BaseRichBolt {
Maybe add a comment explaining how it should be connected to the status updater bolt
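The connection asked about here can be pictured with a small plain-Java model. This is only a sketch: it does not use the Storm or StormCrawler APIs, and the class names `StatusUpdater` and `Deleter`, the stream name `"deletion"`, and the rule that only URLs with an `ERROR` status are forwarded are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

public class DeletionWiringSketch {

    // Assumed name of the dedicated stream between the two components.
    static final String DELETION_STREAM = "deletion";

    /** Toy stand-in for a status updater: forwards URLs to subscribers by stream name. */
    static class StatusUpdater {
        private final Map<String, List<Consumer<String>>> subscribers = new HashMap<>();

        void subscribe(String stream, Consumer<String> consumer) {
            subscribers.computeIfAbsent(stream, k -> new ArrayList<>()).add(consumer);
        }

        void update(String url, String status) {
            // Persisting the status itself is omitted in this sketch.
            // Assumption: only URLs that end up in ERROR are sent for deletion.
            if ("ERROR".equals(status)) {
                List<Consumer<String>> targets =
                        subscribers.getOrDefault(
                                DELETION_STREAM, Collections.<Consumer<String>>emptyList());
                for (Consumer<String> target : targets) {
                    target.accept(url);
                }
            }
        }
    }

    /** Toy stand-in for a deletion bolt: removes the document keyed by the URL. */
    static class Deleter implements Consumer<String> {
        private final Map<String, String> index;

        Deleter(Map<String, String> index) {
            this.index = index;
        }

        @Override
        public void accept(String url) {
            index.remove(url);
        }
    }

    /** Wires the two components together and returns the index after two updates. */
    static Map<String, String> run() {
        Map<String, String> index = new HashMap<>();
        index.put("http://example.com/a", "doc a");
        index.put("http://example.com/b", "doc b");

        StatusUpdater status = new StatusUpdater();
        status.subscribe(DELETION_STREAM, new Deleter(index));

        status.update("http://example.com/a", "FETCHED"); // stays in the index
        status.update("http://example.com/b", "ERROR");   // forwarded and deleted
        return index;
    }

    public static void main(String[] args) {
        System.out.println(run().keySet()); // prints [http://example.com/a]
    }
}
```

The point of the model is that the deleter never looks at statuses itself: it only sees what the status updater chooses to forward on the dedicated stream. The real wiring call is best taken from SolrCrawlTopology once the bolt has been added there.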
thanks @syefimov
* Create DeletionBolt.java storm-crawler-solr bug. Missing DeletionBolt bolt code. apache#1050
* Update DeletionBolt.java License header added
* Update DeletionBolt.java formatting Signed-off-by: Michael Dinzinger <michael.dinzinger@uni-passau.de>
* Remove injection from crawl topologies in *Search archetypes, fixes #1065 Signed-off-by: Julien Nioche <julien@digitalpebble.com> Signed-off-by: Michael Dinzinger <michael.dinzinger@uni-passau.de>
* BasicURLNormalizer .unmangleQueryString() returns invalid results if "&" symbol in a parents path #1059 (#1062)
* Fix unmangleQueryString filter. Fix unmangleQueryString filter. Do not analyze full URL path, just last child,
* formatting Signed-off-by: Michael Dinzinger <michael.dinzinger@uni-passau.de>
* Removed remaining references to ES in OPenSearch module Signed-off-by: Julien Nioche <julien@digitalpebble.com> Signed-off-by: Michael Dinzinger <michael.dinzinger@uni-passau.de>
* Dependency upgrades.fixes #1066 (#1067) Signed-off-by: Julien Nioche <julien@digitalpebble.com> Signed-off-by: Michael Dinzinger <michael.dinzinger@uni-passau.de>
* Automatic creation of index definitions should use the bolt type (#1069) Signed-off-by: Julien Nioche <julien@digitalpebble.com> Signed-off-by: Michael Dinzinger <michael.dinzinger@uni-passau.de>
* Maven plugin upgrades + better handling of plugin versions Signed-off-by: Julien Nioche <julien@digitalpebble.com> Signed-off-by: Michael Dinzinger <michael.dinzinger@uni-passau.de>
* bgufix test jar not attached Signed-off-by: Julien Nioche <julien@digitalpebble.com> Signed-off-by: Michael Dinzinger <michael.dinzinger@uni-passau.de>
* Update maven.yml v3 version of actions Signed-off-by: Michael Dinzinger <michael.dinzinger@uni-passau.de>
* mechanism to retrieve more generic value of configuration (#1071)
* mechanism to retrieve more generic value of configuration if a specific one is not found, fixes #1070 Signed-off-by: Julien Nioche <julien@digitalpebble.com>
* minor javadoc fix Signed-off-by: Julien Nioche <julien@digitalpebble.com>
--------- Signed-off-by: Julien Nioche <julien@digitalpebble.com> Signed-off-by: Michael Dinzinger <michael.dinzinger@uni-passau.de>
* Batch requests in DeleterBolt, fixes #1072 Signed-off-by: Julien Nioche <julien@digitalpebble.com> Signed-off-by: Michael Dinzinger <michael.dinzinger@uni-passau.de>
* Update README.md link to docker project Signed-off-by: Michael Dinzinger <michael.dinzinger@uni-passau.de>
* Create DeletionBolt.java for Solr. #1050 (#1073)
* Create DeletionBolt.java storm-crawler-solr bug. Missing DeletionBolt bolt code. #1050
* Update DeletionBolt.java License header added
* Update DeletionBolt.java formatting Signed-off-by: Michael Dinzinger <michael.dinzinger@uni-passau.de>
* SOLR: suppress warnings + minor changes and Javadoc + added deletion to default topology Signed-off-by: Julien Nioche <julien@digitalpebble.com> Signed-off-by: Michael Dinzinger <michael.dinzinger@uni-passau.de>
* Tika 2.8.0, fixes 1066 Signed-off-by: Julien Nioche <julien@digitalpebble.com> Signed-off-by: Michael Dinzinger <michael.dinzinger@uni-passau.de>
* Increase the number of redirects to 5 for Robots.txt fetching (#1074)
* Issue #1058: Allow 5 redirects for Robots.txt fetching Signed-off-by: Michael Dinzinger <michael.dinzinger@uni-passau.de>
* Minor variable renaming Signed-off-by: Michael Dinzinger <michael.dinzinger@uni-passau.de>
--------- Signed-off-by: Michael Dinzinger <michael.dinzinger@uni-passau.de>
* Add test coverage reports with JaCoCo and Coveralls, fixes #1075 Signed-off-by: Julien Nioche <julien@digitalpebble.com> Signed-off-by: Michael Dinzinger <michael.dinzinger@uni-passau.de>
* #1075 - Add test coverage reports with JaCoCo Signed-off-by: Richard Zowalla <richard.zowalla@hs-heilbronn.de> Signed-off-by: Michael Dinzinger <michael.dinzinger@uni-passau.de>
* #1075 - Update GH workflow to reduce log spam by adding -B and --no-transfer-progess maven options Signed-off-by: Richard Zowalla <richard.zowalla@hs-heilbronn.de> Signed-off-by: Michael Dinzinger <michael.dinzinger@uni-passau.de>
* Rebase - Issue #1042: Forbid all rules by default Signed-off-by: Michael Dinzinger <michael.dinzinger@uni-passau.de>
* Modify Robots.txt parsing logic and add test cases Signed-off-by: Michael Dinzinger <michael.dinzinger@uni-passau.de>
* Parse robots txt rules only for status code 200 Signed-off-by: Michael Dinzinger <michael.dinzinger@uni-passau.de>
* Trying to resolve merge conflicts Signed-off-by: Michael Dinzinger <michael.dinzinger@uni-passau.de>
* Modify Robots.txt parsing logic and add test cases Signed-off-by: Michael Dinzinger <michael.dinzinger@uni-passau.de>
* Parse robots txt rules only for status code 200 Signed-off-by: Michael Dinzinger <michael.dinzinger@uni-passau.de>
* Merge HttpRobotRulesParserTest Signed-off-by: Michael Dinzinger <michael.dinzinger@uni-passau.de>
---------
Signed-off-by: Julien Nioche <julien@digitalpebble.com>
Signed-off-by: Michael Dinzinger <michael.dinzinger@uni-passau.de>
Signed-off-by: Richard Zowalla <richard.zowalla@hs-heilbronn.de>
Co-authored-by: Julien Nioche <julien@digitalpebble.com>
Co-authored-by: syefimov <syefimov@ptfs.com>
Co-authored-by: Richard Zowalla <richard.zowalla@hs-heilbronn.de>
storm-crawler-solr bug. Missing DeletionBolt bolt code. #1050