NUTCH-2886 Move Nutch WebApp to separate repository #693

Merged (1 commit) on Jul 13, 2021

@lewismc lewismc commented Jul 12, 2021

This PR addresses https://issues.apache.org/jira/browse/NUTCH-2886
N.B. The code is NOT being deleted outright; it has been moved to https://github.com/apache/nutch-webapp so that it can be maintained as a separate layer.
This reduces the number of dependencies in ivy/ivy.xml and hence makes the core Nutch codebase more secure, especially given the low level of development activity on the WebApp.
Comments welcome. Thanks.

@lewismc
Member Author

lewismc commented Jul 12, 2021

This would have the following impact on the Nutch source.
Before restructuring:

github.com/AlDanial/cloc v 1.88  T=2.42 s (415.1 files/s, 57736.9 lines/s)
--------------------------------------------------------------------------------
Language                      files          blank        comment           code
--------------------------------------------------------------------------------
Java                            658          12390          26610          60066
XML                             169           2535           3463          14813
HTML                             28           2556            376           3752
XSLT                              9            343            188           3292
Ant                              80            550           1399           1969
JSON                              2             26              0           1733
Markdown                         16            244              0            652
Bourne Shell                     28            147             99            553
Bourne Again Shell                3             78            165            549
XSD                               3             18             50            295
DOS Batch                         1             45              1            194
CSS                               1             18             25            106
DTD                               2             42            138             38
YAML                              2              6             16             35
JavaScript                        1              3              1             20
Dockerfile                        1              6             17             14
--------------------------------------------------------------------------------
SUM:                           1004          19007          32548          88081
--------------------------------------------------------------------------------

After restructuring:

github.com/AlDanial/cloc v 1.88  T=2.64 s (355.9 files/s, 51112.2 lines/s)
--------------------------------------------------------------------------------
Language                      files          blank        comment           code
--------------------------------------------------------------------------------
Java                            606          11802          25691          57837
XML                             169           2532           3463          14800
XSLT                              9            343            188           3292
HTML                             18           2533            300           3249
Ant                              80            550           1399           1969
JSON                              2             26              0           1733
Markdown                         16            244              0            652
Bourne Shell                     28            147             99            553
Bourne Again Shell                3             78            165            546
XSD                               3             18             50            295
DOS Batch                         1             45              1            194
DTD                               2             42            138             38
YAML                              2              6             16             35
JavaScript                        1              3              1             20
Dockerfile                        1              6             17             14
--------------------------------------------------------------------------------
SUM:                            941          18375          31528          85227
--------------------------------------------------------------------------------

The Web Application accounts for roughly 2,850 lines of code (88,081 − 85,227 = 2,854 by the cloc totals above).
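
For anyone wanting to see where the removed lines come from, the two cloc reports above can be diffed per language. The following is only a minimal sketch: the figures are copied by hand from the "code" column of each report, and the helper name `code_delta` is made up for illustration (it is not part of cloc or Nutch).

```python
# Code-line counts copied by hand from the "code" column of the cloc reports.
before = {
    "Java": 60066, "XML": 14813, "HTML": 3752, "XSLT": 3292, "Ant": 1969,
    "JSON": 1733, "Markdown": 652, "Bourne Shell": 553,
    "Bourne Again Shell": 549, "XSD": 295, "DOS Batch": 194, "CSS": 106,
    "DTD": 38, "YAML": 35, "JavaScript": 20, "Dockerfile": 14,
}
after = {
    "Java": 57837, "XML": 14800, "XSLT": 3292, "HTML": 3249, "Ant": 1969,
    "JSON": 1733, "Markdown": 652, "Bourne Shell": 553,
    "Bourne Again Shell": 546, "XSD": 295, "DOS Batch": 194,
    "DTD": 38, "YAML": 35, "JavaScript": 20, "Dockerfile": 14,
}


def code_delta(before, after):
    """Hypothetical helper: lines removed per language, unchanged languages omitted.

    A language missing from one report is treated as having 0 lines there.
    """
    languages = set(before) | set(after)
    return {
        lang: before.get(lang, 0) - after.get(lang, 0)
        for lang in languages
        if before.get(lang, 0) != after.get(lang, 0)
    }


removed = code_delta(before, after)

# Print the largest reductions first, then the overall total.
for lang, lines in sorted(removed.items(), key=lambda item: -item[1]):
    print(f"{lang}: {lines}")
print("total:", sum(removed.values()))
```

With these figures the reduction is 2,229 lines of Java, 503 of HTML, 106 of CSS (the only stylesheet is gone entirely), 13 of XML, and 3 of Bash.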

@sebastian-nagel
Contributor

+1 looks good, compiles, makes the Nutch job file 18 MB smaller.

@lewismc lewismc merged commit 96b6e0c into apache:master Jul 13, 2021
@lewismc lewismc deleted the NUTCH-2886 branch July 13, 2021 23:19
@lewismc
Member Author

lewismc commented Jul 13, 2021

Thanks @sebastian-nagel

sebastian-nagel pushed a commit to sebastian-nagel/nutch that referenced this pull request Sep 22, 2021