Can't download XLSX File #824

Closed
parvjain639 opened this issue Jul 27, 2023 · 17 comments

Comments

@parvjain639

For many projects, I am unable to download the report in XLSX format.
The download fails with a server error (HTTP 500).

Screenshot 2023-07-27 111707

@tdruez
Member

tdruez commented Jul 27, 2023

Hi @parvjain639, could you provide some context in order to reproduce this issue?

  • Which version of ScanCode.io are you running?
  • Using Docker?
  • On which OS?
  • What inputs are you using?
  • Which pipeline are you running?

@parvjain639
Author

  1. V32.4.0
  2. Yes
  3. Debian 11.6
  4. Link 1: https://hub.docker.com/_/python
     Link 2: https://hub.docker.com/r/ioexpert/netpi-openplc
  5. docker, find_vulnerabilities, and scan_codebase

@tdruez
Member

tdruez commented Jul 27, 2023

@parvjain639 can you try the other download formats and let me know if you also have a 500 for those?

@parvjain639
Author

Yes, I have tried downloading the output in the other formats, and those downloads work fine.
The error only appears when I try to download the XLSX format.

@tdruez
Member

tdruez commented Jul 27, 2023

Ok, thanks for the confirmation.
I cannot reproduce so far by running docker + find_vulnerabilities pipelines on the docker://python input.

Could you look into the web container log?

docker compose logs --tail="200" web

Is there anything related to the issue?

@parvjain639
Author

Command: docker compose logs --tail="200" web

Result:
web_1 | ^^^^^^^^^^^^^^^^^^^
web_1 | File "/usr/local/lib/python3.11/site-packages/django/views/generic/base.py", line 104, in view
web_1 | return self.dispatch(request, *args, **kwargs)
web_1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web_1 | File "/usr/local/lib/python3.11/site-packages/django/contrib/auth/mixins.py", line 135, in dispatch
web_1 | return super().dispatch(request, *args, **kwargs)
web_1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web_1 | File "/usr/local/lib/python3.11/site-packages/django/views/generic/base.py", line 143, in dispatch
web_1 | return handler(request, *args, **kwargs)
web_1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web_1 | File "/app/scanpipe/views.py", line 988, in get
web_1 | output_file = output.to_xlsx(project)
web_1 | ^^^^^^^^^^^^^^^^^^^^^^^
web_1 | File "/app/scanpipe/pipes/output.py", line 471, in to_xlsx
web_1 | if layers_data := docker.get_layers_data(project):
web_1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web_1 | File "/app/scanpipe/pipes/docker.py", line 291, in get_layers_data
web_1 | image_id = image.get("image_id")
web_1 | ^^^^^^^^^
web_1 | AttributeError: 'str' object has no attribute 'get'
web_1 | ERROR Internal Server Error: /project/pythonlatest-c46201cd/results/xlsx/
web_1 | Traceback (most recent call last):
web_1 | File "/usr/local/lib/python3.11/site-packages/django/core/handlers/exception.py", line 55, in inner
web_1 | response = get_response(request)
web_1 | ^^^^^^^^^^^^^^^^^^^^^
web_1 | File "/usr/local/lib/python3.11/site-packages/django/core/handlers/base.py", line 197, in _get_response
web_1 | response = wrapped_callback(request, *callback_args, **callback_kwargs)
web_1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web_1 | File "/usr/local/lib/python3.11/contextlib.py", line 81, in inner
web_1 | return func(*args, **kwds)
web_1 | ^^^^^^^^^^^^^^^^^^^
web_1 | File "/usr/local/lib/python3.11/site-packages/django/views/generic/base.py", line 104, in view
web_1 | return self.dispatch(request, *args, **kwargs)
web_1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web_1 | File "/usr/local/lib/python3.11/site-packages/django/contrib/auth/mixins.py", line 135, in dispatch
web_1 | return super().dispatch(request, *args, **kwargs)
web_1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web_1 | File "/usr/local/lib/python3.11/site-packages/django/views/generic/base.py", line 143, in dispatch
web_1 | return handler(request, *args, **kwargs)
web_1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web_1 | File "/app/scanpipe/views.py", line 988, in get
web_1 | output_file = output.to_xlsx(project)
web_1 | ^^^^^^^^^^^^^^^^^^^^^^^
web_1 | File "/app/scanpipe/pipes/output.py", line 471, in to_xlsx
web_1 | if layers_data := docker.get_layers_data(project):
web_1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web_1 | File "/app/scanpipe/pipes/docker.py", line 291, in get_layers_data
web_1 | image_id = image.get("image_id")
web_1 | ^^^^^^^^^
web_1 | AttributeError: 'str' object has no attribute 'get'
web_1 | ERROR Internal Server Error: /project/redis-ff00d238/results/xlsx/
web_1 | Traceback (most recent call last):
web_1 | File "/usr/local/lib/python3.11/site-packages/django/core/handlers/exception.py", line 55, in inner
web_1 | response = get_response(request)
web_1 | ^^^^^^^^^^^^^^^^^^^^^
web_1 | File "/usr/local/lib/python3.11/site-packages/django/core/handlers/base.py", line 197, in _get_response
web_1 | response = wrapped_callback(request, *callback_args, **callback_kwargs)
web_1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web_1 | File "/usr/local/lib/python3.11/contextlib.py", line 81, in inner
web_1 | return func(*args, **kwds)
web_1 | ^^^^^^^^^^^^^^^^^^^
web_1 | File "/usr/local/lib/python3.11/site-packages/django/views/generic/base.py", line 104, in view
web_1 | return self.dispatch(request, *args, **kwargs)
web_1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web_1 | File "/usr/local/lib/python3.11/site-packages/django/contrib/auth/mixins.py", line 135, in dispatch
web_1 | return super().dispatch(request, *args, **kwargs)
web_1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web_1 | File "/usr/local/lib/python3.11/site-packages/django/views/generic/base.py", line 143, in dispatch
web_1 | return handler(request, *args, **kwargs)
web_1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web_1 | File "/app/scanpipe/views.py", line 988, in get
web_1 | output_file = output.to_xlsx(project)
web_1 | ^^^^^^^^^^^^^^^^^^^^^^^
web_1 | File "/app/scanpipe/pipes/output.py", line 471, in to_xlsx
web_1 | if layers_data := docker.get_layers_data(project):
web_1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web_1 | File "/app/scanpipe/pipes/docker.py", line 291, in get_layers_data
web_1 | image_id = image.get("image_id")
web_1 | ^^^^^^^^^
web_1 | AttributeError: 'str' object has no attribute 'get'

This is the output I received from the command.
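Since the same traceback repeats many times in this kind of output, a small helper can condense it. The following is only a sketch: it assumes each log line carries a `web_1 | ` style prefix, that Django logs an `ERROR Internal Server Error: <path>` line before each traceback, and that the traceback ends with an `SomethingError: message` line, as in the paste above. The function name `summarize_errors` is made up for illustration.

```python
import re
from collections import Counter

# Matches the final line of a Python traceback, e.g. "AttributeError: ...".
EXCEPTION_LINE = re.compile(r"^[A-Za-z_][\w.]*(Error|Exception): ")
ERROR_PREFIX = "ERROR Internal Server Error:"


def summarize_errors(log_text):
    """Count (request path, exception line) pairs in compose log output."""
    counts = Counter()
    current_path = None  # stays None when the traceback header was cut off
    for raw_line in log_text.splitlines():
        # Drop the "web_1 | " container prefix when present.
        _, separator, message = raw_line.partition("|")
        message = message.strip() if separator else raw_line.strip()
        if message.startswith(ERROR_PREFIX):
            current_path = message[len(ERROR_PREFIX):].strip()
        elif EXCEPTION_LINE.match(message):
            counts[(current_path, message)] += 1
            current_path = None
    return counts
```

Feeding it the saved log text (for example the contents of a file the output was redirected to) returns one counter entry per distinct path and exception, which is easier to paste into an issue than hundreds of repeated lines.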

@tdruez
Member

tdruez commented Jul 27, 2023

Thanks, that's helpful!

link 1. https://hub.docker.com/_/python

Can you clarify the exact input you provided to the Project?
Was it a docker type URL or an upload of an image downloaded from docker.com?

@parvjain639
Author

We used a Docker-type URL:
docker://python:latest

@tdruez
Member

tdruez commented Jul 28, 2023

@parvjain639 Ok, did you make any particular configuration changes or are you running the default?

web_1 | AttributeError: 'str' object has no attribute 'get'

For some reason, the content of project.extra_data appears to be a string instead of the expected JSON/dict. Let's look at the data:

Could you access a Django shell using:

docker compose run web ./manage.py shell

Once in the shell:

from scanpipe.models import Project
project = Project.objects.get(slug="redis-ff00d238")
print(project.extra_data)
print(type(project.extra_data))
print(type(project.extra_data.get("images")))

Please provide the output of those commands; that should give us a better idea of the shape of the data.
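As a self-contained illustration of how this exact message can arise, the sketch below mimics the failing loop from the traceback outside of Django. The two data shapes are assumptions made up for the example (they are not taken from the reporter's project): a list of dicts works, while a dict keyed by name yields its string keys when iterated, and calling `.get` on a string raises the error seen in the log.

```python
def image_ids(extra_data):
    """Mimic the failing loop: call .get("image_id") on each "images" entry."""
    return [image.get("image_id") for image in extra_data.get("images", [])]


# Hypothetical shapes, for illustration only.
expected = {"images": [{"image_id": "abc123"}]}
as_mapping = {"images": {"python_latest.tar": {"image_id": "abc123"}}}

print(image_ids(expected))  # ['abc123']

try:
    image_ids(as_mapping)  # iterating a dict yields its str keys
except AttributeError as error:
    print(error)  # 'str' object has no attribute 'get'
```

This is why printing the type of `project.extra_data.get("images")` is useful: it distinguishes a list from a dict or a plain string.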

@parvjain639
Author

We are using the default configuration files; no changes were made.

Screenshot 2023-07-28 102419

Could you access a Django shell using:
I am not getting clear results from this.

@tdruez
Member

tdruez commented Jul 28, 2023

Thanks, it seems that one of the pipeline steps may have an impact on the extra_data structure.

Could you paste the output of:

docker compose run web ./manage.py status --project PROJECT_NAME

In your case I'm guessing the PROJECT_NAME is redis.

@parvjain639
Author

Here is the Output of the Command!

root@debian:~/projects/scancode.io# docker compose run web ./manage.py status --project Python.Latest
[+] Building 0.0s (0/0)
[+] Creating 1/0
✔ Container scancodeio-db-1 Running 0.0s
[+] Building 0.0s (0/0)
Project: Python.Latest
Create date: Jul 19 2023 04:51
Work directory: /var/scancodeio/workspace/projects/pythonlatest-c46201cd

Database:

  • CodebaseResource: 99558
    • (no status): 35
    • application-package: 3592
    • ignored-empty-file: 937
    • ignored-not-interesting: 271
    • ignored-whiteout: 624
    • no-licenses: 3550
    • scanned: 67688
    • scanned-with-error: 6
    • symlink: 2640
    • system-package: 20208
    • unknown-license: 7
  • DiscoveredPackage: 477
  • ProjectError: 10

Inputs:

  • python_latest.tar (docker://python:latest)

Pipelines:
[FAILURE] deploy_to_develop
2023-07-19 04:51:56.84 Pipeline [deploy_to_develop] starting
2023-07-19 04:51:57.19 Step [get_inputs] starting
2023-07-19 04:51:57.20 Pipeline failed
[SUCCESS] docker (executed in 4326 seconds)
2023-07-19 04:52:12.03 Pipeline [docker] starting
2023-07-19 04:52:12.29 Step [extract_images] starting
2023-07-19 04:52:24.87 Step [extract_images] completed in 13 seconds
2023-07-19 04:52:26.75 Step [extract_layers] starting
2023-07-19 04:52:47.08 Step [extract_layers] completed in 19 seconds
2023-07-19 04:52:47.10 Step [find_images_os_and_distro] starting
2023-07-19 04:52:47.10 Step [find_images_os_and_distro] completed in 0 seconds
2023-07-19 04:52:47.11 Step [collect_images_information] starting
2023-07-19 04:52:47.13 Step [collect_images_information] completed in 0 seconds
2023-07-19 04:52:47.13 Step [collect_and_create_codebase_resources] starting
2023-07-19 05:01:27.64 Step [collect_and_create_codebase_resources] completed in 520 seconds (8.7 minutes)
2023-07-19 05:01:27.65 Step [collect_and_create_system_packages] starting
2023-07-19 05:47:41.41 Step [collect_and_create_system_packages] completed in 2774 seconds (46.2 minutes)
2023-07-19 05:47:41.43 Step [flag_uninteresting_codebase_resources] starting
2023-07-19 05:47:41.85 Step [flag_uninteresting_codebase_resources] completed in 0 seconds
2023-07-19 05:47:41.86 Step [flag_empty_files] starting
2023-07-19 05:47:41.91 Step [flag_empty_files] completed in 0 seconds
2023-07-19 05:47:41.93 Step [flag_ignored_resources] starting
2023-07-19 05:47:41.93 Step [flag_ignored_resources] completed in 0 seconds
2023-07-19 05:47:41.94 Step [scan_for_application_packages] starting
2023-07-19 05:50:59.59 Step [scan_for_application_packages] completed in 198 seconds (3.3 minutes)
2023-07-19 05:50:59.60 Step [scan_for_files] starting
2023-07-19 06:04:16.61 Step [scan_for_files] completed in 797 seconds (13.3 minutes)
2023-07-19 06:04:16.62 Step [analyze_scanned_files] starting
2023-07-19 06:04:18.02 Step [analyze_scanned_files] completed in 1 seconds
2023-07-19 06:04:18.03 Step [flag_not_analyzed_codebase_resources] starting
2023-07-19 06:04:18.04 Step [flag_not_analyzed_codebase_resources] completed in 0 seconds
2023-07-19 06:04:18.05 Pipeline completed
[SUCCESS] find_vulnerabilities (executed in 795 seconds)
2023-07-19 06:06:25.27 Pipeline [find_vulnerabilities] starting
2023-07-19 06:06:25.29 Step [check_vulnerablecode_service_availability] starting
2023-07-19 06:06:34.62 Step [check_vulnerablecode_service_availability] completed in 9 seconds
2023-07-19 06:06:34.64 Step [lookup_vulnerabilities] starting
2023-07-19 06:19:40.82 Step [lookup_vulnerabilities] completed in 786 seconds (13.1 minutes)
2023-07-19 06:19:40.84 Pipeline completed
[FAILURE] inspect_manifest
2023-07-19 06:23:04.19 Pipeline [inspect_manifest] starting
2023-07-19 06:23:04.20 Step [get_manifest_inputs] starting
2023-07-19 06:23:04.21 Step [get_manifest_inputs] completed in 0 seconds
2023-07-19 06:23:04.22 Step [get_packages_from_manifest] starting
2023-07-19 06:23:04.26 Pipeline failed
[SUCCESS] load_inventory
2023-07-19 06:23:35.16 Pipeline [load_inventory] starting
2023-07-19 06:23:35.18 Step [get_inputs] starting
2023-07-19 06:23:35.19 Step [get_inputs] completed in 0 seconds
2023-07-19 06:23:35.20 Step [build_inventory_from_scans] starting
2023-07-19 06:23:35.21 Step [build_inventory_from_scans] completed in 0 seconds
2023-07-19 06:23:35.23 Pipeline completed
[SUCCESS] root_filesystems (executed in 1046 seconds)
2023-07-19 06:24:12.61 Pipeline [root_filesystems] starting
2023-07-19 06:24:12.62 Step [extract_input_files_to_codebase_directory] starting
2023-07-19 06:24:30.80 Step [extract_input_files_to_codebase_directory] completed in 18 seconds
2023-07-19 06:24:30.84 Step [find_root_filesystems] starting
2023-07-19 06:24:30.85 Step [find_root_filesystems] completed in 0 seconds
2023-07-19 06:24:30.86 Step [collect_rootfs_information] starting
2023-07-19 06:24:30.86 Step [collect_rootfs_information] completed in 0 seconds
2023-07-19 06:24:30.87 Step [collect_and_create_codebase_resources] starting
2023-07-19 06:30:22.24 Step [collect_and_create_codebase_resources] completed in 351 seconds (5.9 minutes)
2023-07-19 06:30:22.25 Step [collect_and_create_system_packages] starting
2023-07-19 06:30:22.28 Step [collect_and_create_system_packages] completed in 0 seconds
2023-07-19 06:30:22.31 Step [flag_uninteresting_codebase_resources] starting
2023-07-19 06:30:22.33 Step [flag_uninteresting_codebase_resources] completed in 0 seconds
2023-07-19 06:30:22.34 Step [flag_empty_files] starting
2023-07-19 06:30:22.35 Step [flag_empty_files] completed in 0 seconds
2023-07-19 06:30:22.36 Step [flag_ignored_resources] starting
2023-07-19 06:30:22.36 Step [flag_ignored_resources] completed in 0 seconds
2023-07-19 06:30:22.37 Step [scan_for_application_packages] starting
2023-07-19 06:33:26.71 Step [scan_for_application_packages] completed in 184 seconds (3.1 minutes)
2023-07-19 06:33:26.73 Step [match_not_analyzed_to_system_packages] starting
2023-07-19 06:37:24.29 Step [match_not_analyzed_to_system_packages] completed in 238 seconds (4.0 minutes)
2023-07-19 06:37:24.31 Step [scan_for_files] starting
2023-07-19 06:41:38.72 Step [scan_for_files] completed in 254 seconds (4.2 minutes)
2023-07-19 06:41:38.73 Step [analyze_scanned_files] starting
2023-07-19 06:41:38.86 Step [analyze_scanned_files] completed in 0 seconds
2023-07-19 06:41:38.88 Step [flag_not_analyzed_codebase_resources] starting
2023-07-19 06:41:38.90 Step [flag_not_analyzed_codebase_resources] completed in 0 seconds
2023-07-19 06:41:38.91 Pipeline completed
[SUCCESS] scan_codebase (executed in 13818 seconds)
2023-07-19 06:43:47.93 Pipeline [scan_codebase] starting
2023-07-19 06:43:48.37 Step [copy_inputs_to_codebase_directory] starting
2023-07-19 06:43:51.75 Step [copy_inputs_to_codebase_directory] completed in 3 seconds
2023-07-19 06:43:51.76 Step [extract_archives] starting
2023-07-19 07:05:21.04 Step [extract_archives] completed in 1289 seconds (21.5 minutes)
2023-07-19 07:05:21.07 Step [collect_and_create_codebase_resources] starting
2023-07-19 07:21:51.36 Step [collect_and_create_codebase_resources] completed in 990 seconds (16.5 minutes)
2023-07-19 07:21:51.38 Step [flag_empty_files] starting
2023-07-19 07:21:52.18 Step [flag_empty_files] completed in 1 seconds
2023-07-19 07:21:52.22 Step [flag_ignored_resources] starting
2023-07-19 07:21:52.22 Step [flag_ignored_resources] completed in 0 seconds
2023-07-19 07:21:52.23 Step [scan_for_application_packages] starting
2023-07-19 07:48:29.04 Step [scan_for_application_packages] completed in 1597 seconds (26.6 minutes)
2023-07-19 07:48:29.06 Step [scan_for_files] starting
2023-07-19 10:34:06.14 Step [scan_for_files] completed in 9937 seconds (2.8 hours)
2023-07-19 10:34:06.16 Pipeline completed
[SUCCESS] scan_package (executed in 18813 seconds)
2023-07-19 10:34:53.30 Pipeline [scan_package] starting
2023-07-19 10:34:53.72 Step [get_package_archive_input] starting
2023-07-19 10:34:53.73 Step [get_package_archive_input] completed in 0 seconds
2023-07-19 10:34:53.74 Step [collect_archive_information] starting
2023-07-19 10:35:17.63 Step [collect_archive_information] completed in 24 seconds
2023-07-19 10:35:17.66 Step [extract_archive_to_codebase_directory] starting
2023-07-19 10:35:34.32 Step [extract_archive_to_codebase_directory] completed in 17 seconds
2023-07-19 10:35:34.33 Step [run_scancode] starting
2023-07-19 15:42:22.59 Step [run_scancode] completed in 18408 seconds (5.1 hours)
2023-07-19 15:42:22.60 Step [load_inventory_from_toolkit_scan] starting
2023-07-19 15:48:13.89 Step [load_inventory_from_toolkit_scan] completed in 351 seconds (5.9 minutes)
2023-07-19 15:48:13.91 Step [make_summary_from_scan_results] starting
2023-07-19 15:48:26.56 Step [make_summary_from_scan_results] completed in 13 seconds
2023-07-19 15:48:26.57 Pipeline completed
[SUCCESS] find_vulnerabilities (executed in 213 seconds)
2023-07-19 15:48:27.21 Pipeline [find_vulnerabilities] starting
2023-07-19 15:48:30.41 Step [check_vulnerablecode_service_availability] starting
2023-07-19 15:48:32.55 Step [check_vulnerablecode_service_availability] completed in 2 seconds
2023-07-19 15:48:32.57 Step [lookup_vulnerabilities] starting
2023-07-19 15:52:00.66 Step [lookup_vulnerabilities] completed in 208 seconds (3.5 minutes)
2023-07-19 15:52:00.68 Pipeline completed
[FAILURE] inspect_manifest
2023-07-20 04:07:13.34 Pipeline [inspect_manifest] starting
2023-07-20 04:07:13.72 Step [get_manifest_inputs] starting
2023-07-20 04:07:13.74 Step [get_manifest_inputs] completed in 0 seconds
2023-07-20 04:07:13.75 Step [get_packages_from_manifest] starting
2023-07-20 04:07:13.79 Pipeline failed
[FAILURE] deploy_to_develop
2023-07-20 07:31:29.14 Pipeline [deploy_to_develop] starting
2023-07-20 07:31:29.59 Step [get_inputs] starting
2023-07-20 07:31:29.60 Pipeline failed
[SUCCESS] find_vulnerabilities (executed in 165 seconds)
2023-07-25 05:49:52.73 Pipeline [find_vulnerabilities] starting
2023-07-25 05:49:53.11 Step [check_vulnerablecode_service_availability] starting
2023-07-25 05:49:53.95 Step [check_vulnerablecode_service_availability] completed in 1 seconds
2023-07-25 05:49:53.97 Step [lookup_vulnerabilities] starting
2023-07-25 05:52:38.42 Step [lookup_vulnerabilities] completed in 164 seconds (2.7 minutes)
2023-07-25 05:52:38.43 Pipeline completed

it seems that one of the pipeline steps may have an impact on the extra_data structure.
I guess you are right!
We are using each pipeline one by one, and the XLSX error appears at the root_filesystems pipeline: after running that pipeline, we are unable to download the XLSX report.

@tdruez
Member

tdruez commented Jul 28, 2023

As we are using each pipeline one by one.

Help me understand: why are you running all the pipelines on a single project?
Each pipeline has a specific purpose and expects a certain type of input. For example, if I want to scan a Docker image, I use the docker pipeline.
Running everything one after the other does not make sense and is likely to cause data issues.

Could you clarify what is your initial goal here?
If you want to find the vulnerabilities for a given Docker image, then the docker + find_vulnerabilities is enough.

tdruez added a commit that referenced this issue Jul 28, 2023
Signed-off-by: Thomas Druez <tdruez@nexb.com>
@parvjain639
Author

why are you running all the pipelines on a single project.

We want to find the list of packages, dependencies, and vulnerabilities for a particular project, so we were trying to run all the pipelines in a single project.

In this case, Please let me know, when to use the following Pipelines

  1. Root_Filesystems
  2. Scan_Codebase
  3. Scan_Packages
  4. Deploy_to_Develop

How should these four pipelines be used? I am unable to understand this from the documentation. Could you please guide us?
We understood the other pipelines from the documentation.

Could you clarify what is your initial goal here?

I want a complete compliance report covering keywords such as IP: patents, royalties, legal; ECC: export, cryptography, AI, newtech; GDPR: privacy, regulations, chatgpt; OSS: attribution, contribution, distribution, streamlined obligations compliance, etc.

So, I guess I should run all pipelines for one project, Right??

@tdruez
Member

tdruez commented Jul 28, 2023

So, I guess I should run all pipelines for one project, Right??

No.

In this case, Please let me know, when to use the following Pipelines
Root_Filesystems
Scan_Codebase
Scan_Packages
Deploy_to_Develop

Those are not meant to be run side by side; you need to choose one depending on your input:

  • Your input is a Linux root filesystem, aka rootfs: use root_filesystems
  • Your input is a codebase compressed as an archive: use scan_codebase
  • Your input is a single package archive: use scan_package
  • Your input is a Docker image: use docker
  • Your input is a ScanCode-toolkit or ScanCode.io scan: use load_inventory
  • Your input is a manifest file: use inspect_manifest
  • Your inputs are a development and deployment code tree: use deploy_to_develop
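The list above can be read as a simple lookup from input kind to a single pipeline. The sketch below is only an illustration of that idea: the pipeline names come from the list, but the dictionary keys and the `pipelines_for` helper are made-up labels, not part of ScanCode.io. The optional follow-up step reflects the earlier remark that docker + find_vulnerabilities is enough for a Docker image; this thread does not confirm that pairing for the other pipelines.

```python
# Hypothetical input labels mapped to the pipeline names listed above.
PIPELINE_BY_INPUT = {
    "rootfs": "root_filesystems",
    "codebase_archive": "scan_codebase",
    "package_archive": "scan_package",
    "docker_image": "docker",
    "scan_results": "load_inventory",
    "manifest": "inspect_manifest",
    "develop_and_deploy_trees": "deploy_to_develop",
}


def pipelines_for(input_kind, with_vulnerabilities=False):
    """Return the single matching pipeline, optionally followed by find_vulnerabilities."""
    pipelines = [PIPELINE_BY_INPUT[input_kind]]  # KeyError for unknown kinds
    if with_vulnerabilities:
        pipelines.append("find_vulnerabilities")
    return pipelines


print(pipelines_for("docker_image", with_vulnerabilities=True))
# ['docker', 'find_vulnerabilities']
```

The point of the lookup is that exactly one scanning pipeline is picked per project, rather than running all of them in sequence.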

I am unable to understand this from the documentation.

Detailed documentation about pipelines is available at https://scancodeio.readthedocs.io/en/latest/built-in-pipelines.html#

See also the overview of the pipeline in the UI (click on a pipeline name to see the full details of the steps).

Screenshot 2023-07-28 at 13 07 58

@parvjain639
Author

Thank You So Much For Clarifications!!!

tdruez added a commit that referenced this issue Jul 28, 2023
Signed-off-by: Thomas Druez <tdruez@nexb.com>
@tdruez
Member

tdruez commented Jul 28, 2023

A fix for the initial XLSX download issues has been merged in main in f44fc77
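The content of that commit is not reproduced in this thread. Purely as an illustration of the kind of guard that avoids this class of error, and not as the actual change, a reader of the "images" entry could accept only dict items regardless of the container type. The helper name `iter_image_dicts` is hypothetical.

```python
def iter_image_dicts(extra_data):
    """Yield only dict entries found under extra_data["images"]."""
    images = extra_data.get("images", []) if isinstance(extra_data, dict) else []
    if isinstance(images, dict):
        # A mapping keyed by name: use its values, not its str keys.
        images = list(images.values())
    elif not isinstance(images, list):
        # Anything else (for example a plain string) carries no image entries.
        images = []
    for image in images:
        if isinstance(image, dict):
            yield image


# Hypothetical data: both shapes produce the same image ids.
as_list = {"images": [{"image_id": "abc123"}]}
as_mapping = {"images": {"python_latest.tar": {"image_id": "abc123"}}}
print([image.get("image_id") for image in iter_image_dicts(as_list)])
print([image.get("image_id") for image in iter_image_dicts(as_mapping)])
```

With such a guard, unexpected data left behind by another pipeline would result in an empty or partial layers sheet instead of a 500 response.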

Also, I've improved the documentation regarding the pipeline choices: https://scancodeio.readthedocs.io/en/latest/faq.html#which-pipeline-should-i-use

@tdruez tdruez closed this as completed Jul 28, 2023