# **Olivia-Finder introduction**

Olivia Finder is an open source tool for extracting data from software package dependency networks in package managers. It is designed to be used in conjunction with Olivia.
Olivia Finder obtains up-to-date data through web scraping and can also use CSV files as a data source.


**You can find the documentation at:**

<a href="https://dab0012.github.io/olivia-finder">
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="132" height="20" role="img" aria-label="docs: at Github Pages"><title>docs: at Github Pages</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="132" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="35" height="20" fill="#555"/><rect x="35" width="97" height="20" fill="#4c1"/><rect width="132" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="185" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="250">docs</text><text x="185" y="140" transform="scale(.1)" fill="#fff" textLength="250">docs</text><text aria-hidden="true" x="825" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="870">at Github Pages</text><text x="825" y="140" transform="scale(.1)" fill="#fff" textLength="870">at Github Pages</text></g></svg>
</a>

**Author:**

Daniel Alonso Báscones

## **Prerequisites**

**<span style="color: crimson">
Important:
</span>**

Make sure the requirements are installed:


In [None]:
%pip install -r ../requirements.txt

Add the library path to `sys.path`:

In [1]:
# Add the path to the olivia_finder package
import sys
sys.path.append('../')

## **DataSource**

The data source interface allows us to obtain data from different sources.

At this time, two implementations are available:

-   Web Scraping based
-   CSV files based


First, we import the data source implementation we want. This example uses the Bioconductor scraper.

#### Initialization of the class

In [2]:
from olivia_finder.data_source.scrapers.bioconductor import BiocScraper
bioc_scraper_ds = BiocScraper()

In [2]:
from olivia_finder.data_source.csv_network import CSVNetwork
# Load the network
bioc_csv_ds = CSVNetwork(
    "results/csv_datasets/bioconductor_adjlist_scraping.csv",  # Path to the CSV file
    "Bioconductor",                         # Name of the data source
    "Bioconductor as a CSV file",            # Description of the data source
    dependent_field="name",                 # Name of the field that contains the name of the dependent package
    dependency_field="dependency",          # Name of the field that contains the name of the dependency
    dependent_version_field="version",      # Name of the field that contains the version of the package
    dependency_version_field="dependency_version",     # Name of the field that contains the version of the dependency
    dependent_url_field="url",              # Name of the field that contains the URL of the package
)
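The field arguments above map CSV columns to roles in an adjacency list, where each row links one dependent package to one dependency. As a rough illustration only (this is not how `CSVNetwork` is implemented), the sketch below parses a small in-memory CSV with the same column names using the standard library. The sample rows are based on the CRAN `A3` package output shown later in this notebook, and the helper name `read_adjlist` is hypothetical.

```python
import csv
import io
from collections import defaultdict

# Sample rows in adjacency-list layout: one row per (dependent, dependency) pair.
# An empty dependency_version cell means no version constraint was recorded.
SAMPLE_CSV = """name,version,url,dependency,dependency_version
A3,1.0.0,https://cran.r-project.org/package=A3,R,≥ 2.15.0
A3,1.0.0,https://cran.r-project.org/package=A3,xtable,
A3,1.0.0,https://cran.r-project.org/package=A3,pbapply,
"""


def read_adjlist(text, dependent_field="name", dependency_field="dependency"):
    """Group dependency names by dependent package (hypothetical helper)."""
    adjacency = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        adjacency[row[dependent_field]].append(row[dependency_field])
    return dict(adjacency)


adjacency = read_adjlist(SAMPLE_CSV)
print(adjacency)
```

Passing the column names as arguments mirrors the idea behind `dependent_field` and `dependency_field`: the same reader can handle files with different headers, such as the Libraries.io export.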

#### Obtain package names

Specifically, the BiocScraper class gets the list of packages from the <a href="https://bioconductor.org/packages/release/BiocViews.html#___Software">Bioconductor packages list</a>.

Each specific implementation of a Scraper must manage this process on its own.

In [3]:
package_list = bioc_scraper_ds.obtain_package_names()
package_list[:10]

['ABSSeq',
 'ABarray',
 'ACE',
 'ACME',
 'ADAM',
 'ADAMgui',
 'ADImpute',
 'ADaCGH2',
 'AGDEX',
 'AIMS']

The CSV-based implementation, on the other hand, obtains the names from the dataset.

In [4]:
package_list = bioc_csv_ds.obtain_package_names()
package_list[:10]

['ABSSeq',
 'ABarray',
 'ACE',
 'ACME',
 'ADAM',
 'ADAMgui',
 'ADImpute',
 'ADaCGH2',
 'AGDEX',
 'AIMS']

#### Obtain package data

We can get the data of a single package from its name using the function:
-   ```python
    obtain_package_data(str)
    ```

In [None]:
deepbluer = bioc_scraper_ds.obtain_package_data("DeepBlueR")
deepbluer

{'name': 'DeepBlueR',
 'version': '1.24.1',
 'dependencies': [{'name': 'R', 'version': '>= 3.3'},
  {'name': 'XML', 'version': ''},
  {'name': 'RCurl', 'version': ''},
  {'name': 'GenomicRanges', 'version': ''},
  {'name': 'data.table', 'version': ''},
  {'name': 'stringr', 'version': ''},
  {'name': 'diffr', 'version': ''},
  {'name': 'dplyr', 'version': ''},
  {'name': 'methods', 'version': ''},
  {'name': 'rjson', 'version': ''},
  {'name': 'utils', 'version': ''},
  {'name': 'R.utils', 'version': ''},
  {'name': 'foreach', 'version': ''},
  {'name': 'withr', 'version': ''},
  {'name': 'rtracklayer', 'version': ''},
  {'name': 'GenomeInfoDb', 'version': ''},
  {'name': 'settings', 'version': ''},
  {'name': 'filehash', 'version': ''}],
 'url': 'https://www.bioconductor.org/packages/release/bioc/html/DeepBlueR.html'}

Package names are case-sensitive. If the package is not found, a **ScraperError** exception is raised.

In [5]:
try:
    deepbluer2 = bioc_scraper_ds.obtain_package_data("deepbluer")
except Exception as e:
    print(e)

ScraperError: Package deepbluer not found
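One way to guard against case mismatches is to resolve a user-supplied name against the list of known package names before querying. The sketch below is a standalone helper, not part of Olivia Finder; the function names `build_name_index` and `resolve_name` are hypothetical.

```python
def build_name_index(package_names):
    """Map each lowercased name to its canonical spelling."""
    return {name.lower(): name for name in package_names}


def resolve_name(query, index):
    """Return the canonical package name, or None if there is no match."""
    return index.get(query.lower())


# A few names from the Bioconductor listing shown earlier, plus DeepBlueR.
index = build_name_index(["ABSSeq", "ABarray", "ACE", "DeepBlueR"])
print(resolve_name("deepbluer", index))
print(resolve_name("unknown-package", index))
```

In practice the index could be built from the result of `obtain_package_names()`. Note that two packages whose names differ only in case would collide in this index, so the approach assumes such pairs do not exist in the repository.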


In [5]:
deepbluer = bioc_csv_ds.obtain_package_data("DeepBlueR")
deepbluer

{'name': 'DeepBlueR',
 'version': '1.24.1',
 'url': 'https://www.bioconductor.org/packages/release/bioc/html/DeepBlueR.html',
 'dependencies': [{'name': 'R', 'version': '>= 3.3'},
  {'name': 'XML', 'version': nan},
  {'name': 'RCurl', 'version': nan},
  {'name': 'GenomicRanges', 'version': nan},
  {'name': 'data.table', 'version': nan},
  {'name': 'stringr', 'version': nan},
  {'name': 'diffr', 'version': nan},
  {'name': 'dplyr', 'version': nan},
  {'name': 'methods', 'version': nan},
  {'name': 'rjson', 'version': nan},
  {'name': 'utils', 'version': nan},
  {'name': 'R.utils', 'version': nan},
  {'name': 'foreach', 'version': nan},
  {'name': 'withr', 'version': nan},
  {'name': 'rtracklayer', 'version': nan},
  {'name': 'GenomeInfoDb', 'version': nan},
  {'name': 'settings', 'version': nan},
  {'name': 'filehash', 'version': nan}]}
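The two outputs above represent a missing version differently: the scraper returns an empty string, while the CSV-based source returns `nan`. If both sources need to be compared, one option is to normalize the values first. The following is a minimal sketch under that assumption; `normalize_version` and `normalize_dependencies` are hypothetical helpers, not library functions.

```python
import math


def normalize_version(value):
    """Return None for missing versions ('', None, NaN); otherwise the stripped string."""
    if value is None:
        return None
    if isinstance(value, float) and math.isnan(value):
        return None
    text = str(value).strip()
    return text or None


def normalize_dependencies(dependencies):
    """Return a new list of dependency dicts with normalized versions."""
    return [
        {"name": dep["name"], "version": normalize_version(dep["version"])}
        for dep in dependencies
    ]


# Shortened versions of the two DeepBlueR dependency lists shown above.
scraped = [{"name": "R", "version": ">= 3.3"}, {"name": "XML", "version": ""}]
from_csv = [{"name": "R", "version": ">= 3.3"}, {"name": "XML", "version": float("nan")}]

print(normalize_dependencies(scraped) == normalize_dependencies(from_csv))
```

Normalizing to `None` also avoids a common pitfall: `nan` never compares equal to itself, so dependency lists containing it cannot be compared reliably with `==`.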

#### Obtain packages data

In [None]:
pkgs_data, not_found = bioc_scraper_ds.obtain_packages_data(package_list[:3])
pkgs_data

[{'name': 'ABSSeq',
  'version': '1.52.0',
  'dependencies': [{'name': 'R', 'version': '>= 2.10'},
   {'name': 'methods', 'version': ''},
   {'name': 'locfit', 'version': ''},
   {'name': 'limma', 'version': ''}],
  'url': 'https://www.bioconductor.org/packages/release/bioc/html/ABSSeq.html'},
 {'name': 'ABarray',
  'version': '1.66.0',
  'dependencies': [{'name': 'Biobase', 'version': ''},
   {'name': 'graphics', 'version': ''},
   {'name': 'grDevices', 'version': ''},
   {'name': 'methods', 'version': ''},
   {'name': 'multtest', 'version': ''},
   {'name': 'stats', 'version': ''},
   {'name': 'tcltk', 'version': ''},
   {'name': 'utils', 'version': ''}],
  'url': 'https://www.bioconductor.org/packages/release/bioc/html/ABarray.html'},
 {'name': 'ACE',
  'version': '1.16.0',
  'dependencies': [{'name': 'R', 'version': '>= 3.4'},
   {'name': 'Biobase', 'version': ''},
   {'name': 'QDNAseq', 'version': ''},
   {'name': 'ggplot2', 'version': ''},
   {'name': 'grid', 'version': ''},
   {

In [7]:
packages = bioc_csv_ds.obtain_packages_data(package_list[:3])
packages

[{'name': 'ABSSeq',
  'version': '1.52.0',
  'url': 'https://www.bioconductor.org/packages/release/bioc/html/ABSSeq.html',
  'dependencies': [{'name': 'R', 'version': '>= 2.10'},
   {'name': 'methods', 'version': nan},
   {'name': 'locfit', 'version': nan},
   {'name': 'limma', 'version': nan}]},
 {'name': 'ABarray',
  'version': '1.66.0',
  'url': 'https://www.bioconductor.org/packages/release/bioc/html/ABarray.html',
  'dependencies': [{'name': 'Biobase', 'version': nan},
   {'name': 'graphics', 'version': nan},
   {'name': 'grDevices', 'version': nan},
   {'name': 'methods', 'version': nan},
   {'name': 'multtest', 'version': nan},
   {'name': 'stats', 'version': nan},
   {'name': 'tcltk', 'version': nan},
   {'name': 'utils', 'version': nan}]},
 {'name': 'ACE',
  'version': '1.16.0',
  'url': 'https://www.bioconductor.org/packages/release/bioc/html/ACE.html',
  'dependencies': [{'name': 'R', 'version': '>= 3.4'},
   {'name': 'Biobase', 'version': nan},
   {'name': 'QDNAseq', 'versi

Packages that are not found are returned as the second element of the tuple.

In [7]:
pkgs_data, not_found = bioc_scraper_ds.obtain_packages_data(["deepbluer", "DeepBlueR"])
not_found

['deepbluer']
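The `(found, not_found)` return shape is a general pattern for batch lookups where some items may be missing. The sketch below shows the pattern with a toy in-memory catalog. It is not the library's implementation, and `fetch_many` and its `KeyError` contract are assumptions made for illustration.

```python
def fetch_many(names, fetch_one):
    """Return (found, not_found) for a list of names.

    `fetch_one` is any callable that returns package data or raises KeyError
    when the package is unknown (an assumption made for this sketch).
    """
    found, not_found = [], []
    for name in names:
        try:
            found.append(fetch_one(name))
        except KeyError:
            not_found.append(name)
    return found, not_found


# Toy in-memory "data source" standing in for a real scraper.
catalog = {"DeepBlueR": {"name": "DeepBlueR", "version": "1.24.1"}}

pkgs, missing = fetch_many(["deepbluer", "DeepBlueR"], catalog.__getitem__)
print(missing)
```

Returning the missing names instead of raising on the first failure lets a long batch run to completion, and the caller can decide afterwards whether to retry or report them.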

#### Obtain dependencies recursively

In [3]:
dependency_network = bioc_csv_ds.generate_package_dependency_network("DeepBlueR")
dependency_network

2023-03-24 18:57:36 [   DEBUG] Package R not found in data. (logger.py:122)
2023-03-24 18:57:36 [   DEBUG] The package R does not exist in the data source Bioconductor (logger.py:122)
2023-03-24 18:57:36 [   DEBUG] Package XML not found in data. (logger.py:122)
2023-03-24 18:57:36 [   DEBUG] The package XML does not exist in the data source Bioconductor (logger.py:122)
2023-03-24 18:57:36 [   DEBUG] Package RCurl not found in data. (logger.py:122)
2023-03-24 18:57:36 [   DEBUG] The package RCurl does not exist in the data source Bioconductor (logger.py:122)
2023-03-24 18:57:36 [   DEBUG] Package R not found in data. (logger.py:122)
2023-03-24 18:57:36 [   DEBUG] The package R does not exist in the data source Bioconductor (logger.py:122)
2023-03-24 18:57:36 [   DEBUG] Package methods not found in data. (logger.py:122)
2023-03-24 18:57:36 [   DEBUG] The package methods does not exist in the data source Bioconductor (logger.py:122)
2023-03-24 18:57:36 [   DEBUG] Package stats4 not found 

{'DeepBlueR': [{'name': 'R', 'version': '>= 3.3'},
  {'name': 'XML', 'version': nan},
  {'name': 'RCurl', 'version': nan},
  {'name': 'GenomicRanges', 'version': nan},
  {'name': 'data.table', 'version': nan},
  {'name': 'stringr', 'version': nan},
  {'name': 'diffr', 'version': nan},
  {'name': 'dplyr', 'version': nan},
  {'name': 'methods', 'version': nan},
  {'name': 'rjson', 'version': nan},
  {'name': 'utils', 'version': nan},
  {'name': 'R.utils', 'version': nan},
  {'name': 'foreach', 'version': nan},
  {'name': 'withr', 'version': nan},
  {'name': 'rtracklayer', 'version': nan},
  {'name': 'GenomeInfoDb', 'version': nan},
  {'name': 'settings', 'version': nan},
  {'name': 'filehash', 'version': nan}],
 'GenomicRanges': [{'name': 'R', 'version': '>= 4.0.0'},
  {'name': 'methods', 'version': nan},
  {'name': 'stats4', 'version': nan},
  {'name': 'BiocGenerics', 'version': nan},
  {'name': 'S4Vectors', 'version': nan},
  {'name': 'IRanges', 'version': nan},
  {'name': 'GenomeInfoD

In [3]:
from olivia_finder.data_source.scrapers.pypi import PypiScraper
pypi_scraper = PypiScraper()
networkx_network = pypi_scraper.generate_package_dependency_network("networkx")
networkx_network

2023-03-24 19:39:14 [   DEBUG] Added SSLProxies to proxy builders (logger.py:122)
2023-03-24 19:39:14 [   DEBUG] Added FreeProxyList to proxy builders (logger.py:122)
2023-03-24 19:39:14 [   DEBUG] Added GeonodeProxy to proxy builders (logger.py:122)
2023-03-24 19:39:14 [   DEBUG] Starting new HTTPS connection (1): www.sslproxies.org:443 (connectionpool.py:973)
2023-03-24 19:39:14 [   DEBUG] https://www.sslproxies.org:443 "GET / HTTP/1.1" 200 None (connectionpool.py:452)
2023-03-24 19:39:14 [   DEBUG] Found 100 proxies from SSLProxies (logger.py:122)
2023-03-24 19:39:14 [   DEBUG] Starting new HTTPS connection (1): free-proxy-list.net:443 (connectionpool.py:973)
2023-03-24 19:39:14 [   DEBUG] https://free-proxy-list.net:443 "GET /anonymous-proxy.html HTTP/1.1" 200 None (connectionpool.py:452)
2023-03-24 19:39:14 [   DEBUG] Found 100 proxies from FreeProxyList (logger.py:122)
2023-03-24 19:39:14 [   DEBUG] Starting new HTTPS connection (1): proxylist.geonode.com:443 (connectionpool.py:9

{'networkx': [{'name': 'numpy', 'version': None},
  {'name': 'scipy', 'version': None},
  {'name': 'matplotlib', 'version': None},
  {'name': 'pandas', 'version': None},
  {'name': 'pre', 'version': None},
  {'name': 'mypy', 'version': None},
  {'name': 'sphinx', 'version': None},
  {'name': 'pydata', 'version': None},
  {'name': 'numpydoc', 'version': None},
  {'name': 'pillow', 'version': None},
  {'name': 'nb2plots', 'version': None},
  {'name': 'texext', 'version': None},
  {'name': 'lxml', 'version': None},
  {'name': 'pygraphviz', 'version': None},
  {'name': 'pydot', 'version': None},
  {'name': 'sympy', 'version': None},
  {'name': 'pytest', 'version': None},
  {'name': 'codecov', 'version': None}],
 'numpy': [],
 'scipy': [{'name': 'numpy', 'version': None},
  {'name': 'pytest', 'version': None},
  {'name': 'asv', 'version': None},
  {'name': 'mpmath', 'version': None},
  {'name': 'gmpy2', 'version': None},
  {'name': 'threadpoolctl', 'version': None},
  {'name': 'scikit', 've

In [2]:
from olivia_finder.data_source.scrapers.cran import CranScraper
cran_scraper = CranScraper()
drake_network = cran_scraper.generate_package_dependency_network("drake")
drake_network

2023-03-24 19:33:18 [   DEBUG] Added SSLProxies to proxy builders (logger.py:122)
2023-03-24 19:33:18 [   DEBUG] Added FreeProxyList to proxy builders (logger.py:122)
2023-03-24 19:33:18 [   DEBUG] Added GeonodeProxy to proxy builders (logger.py:122)
2023-03-24 19:33:18 [   DEBUG] Starting new HTTPS connection (1): www.sslproxies.org:443 (connectionpool.py:973)
2023-03-24 19:33:18 [   DEBUG] https://www.sslproxies.org:443 "GET / HTTP/1.1" 200 None (connectionpool.py:452)
2023-03-24 19:33:18 [   DEBUG] Found 100 proxies from SSLProxies (logger.py:122)
2023-03-24 19:33:18 [   DEBUG] Starting new HTTPS connection (1): free-proxy-list.net:443 (connectionpool.py:973)
2023-03-24 19:33:19 [   DEBUG] https://free-proxy-list.net:443 "GET /anonymous-proxy.html HTTP/1.1" 200 None (connectionpool.py:452)
2023-03-24 19:33:19 [   DEBUG] Found 100 proxies from FreeProxyList (logger.py:122)
2023-03-24 19:33:19 [   DEBUG] Starting new HTTPS connection (1): proxylist.geonode.com:443 (connectionpool.py:9

{'drake': [{'name': 'R', 'version': '≥ 3.3.0'},
  {'name': 'base64url', 'version': ''},
  {'name': 'digest', 'version': '≥ 0.6.21'},
  {'name': 'igraph', 'version': ''},
  {'name': 'methods', 'version': ''},
  {'name': 'parallel', 'version': ''},
  {'name': 'rlang', 'version': '≥ 0.2.0'},
  {'name': 'storr', 'version': '≥ 1.1.0'},
  {'name': 'tidyselect', 'version': '≥ 1.0.0'},
  {'name': 'txtq', 'version': '≥ 0.2.3'},
  {'name': 'utils', 'version': ''},
  {'name': 'vctrs', 'version': '≥ 0.2.0'}],
 'digest': [{'name': 'R', 'version': '≥ 3.3.0'},
  {'name': 'utils', 'version': ''}],
 'igraph': [{'name': 'methods', 'version': ''},
  {'name': 'graphics', 'version': ''},
  {'name': 'grDevices', 'version': ''},
  {'name': 'magrittr', 'version': ''},
  {'name': 'Matrix', 'version': ''},
  {'name': 'pkgconfig', 'version': '≥ 2.0.0'},
  {'name': 'rlang', 'version': ''},
  {'name': 'stats', 'version': ''},
  {'name': 'utils', 'version': ''}],
 'Matrix': [{'name': 'R', 'version': '≥ 3.5.0'},
  {

In [3]:
from olivia_finder.data_source.scrapers.npm import NpmScraper
npm_scraper = NpmScraper()
express_network = npm_scraper.generate_package_dependency_network("express")
express_network

{'express': [{'name': 'accepts', 'version': '~1.3.8'},
  {'name': 'array-flatten', 'version': '1.1.1'},
  {'name': 'body-parser', 'version': '1.20.1'},
  {'name': 'content-disposition', 'version': '0.5.4'},
  {'name': 'content-type', 'version': '~1.0.4'},
  {'name': 'cookie', 'version': '0.5.0'},
  {'name': 'cookie-signature', 'version': '1.0.6'},
  {'name': 'debug', 'version': '2.6.9'},
  {'name': 'depd', 'version': '2.0.0'},
  {'name': 'encodeurl', 'version': '~1.0.2'},
  {'name': 'escape-html', 'version': '~1.0.3'},
  {'name': 'etag', 'version': '~1.8.1'},
  {'name': 'finalhandler', 'version': '1.2.0'},
  {'name': 'fresh', 'version': '0.5.2'},
  {'name': 'http-errors', 'version': '2.0.0'},
  {'name': 'merge-descriptors', 'version': '1.0.1'},
  {'name': 'methods', 'version': '~1.1.2'},
  {'name': 'on-finished', 'version': '2.4.1'},
  {'name': 'parseurl', 'version': '~1.3.3'},
  {'name': 'path-to-regexp', 'version': '0.1.7'},
  {'name': 'proxy-addr', 'version': '~2.0.7'},
  {'name': '
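The outputs in this section are dictionaries that map each visited package to its direct dependencies. One common way to build such a structure is a breadth-first traversal that expands each package once. The sketch below illustrates that idea with simplified toy data. Whether `generate_package_dependency_network` works this way internally is not confirmed here, and `build_dependency_network` is a hypothetical function.

```python
from collections import deque


def build_dependency_network(root, get_dependencies):
    """Breadth-first traversal returning {package: [dependency names]}.

    `get_dependencies(name)` returns a list of names, or None when the package
    is unknown to the data source (hypothetical contract for this sketch).
    """
    network = {}
    queue = deque([root])
    while queue:
        name = queue.popleft()
        if name in network:
            continue  # already expanded; also protects against cycles
        deps = get_dependencies(name)
        if deps is None:
            continue  # unknown package: it stays a leaf and is not expanded
        network[name] = list(deps)
        queue.extend(dep for dep in deps if dep not in network)
    return network


# Simplified toy data inspired by the drake example; not the real dependency lists.
toy = {
    "drake": ["digest", "igraph"],
    "digest": ["utils"],
    "igraph": ["Matrix", "utils"],
    "Matrix": [],
}
network = build_dependency_network("drake", toy.get)
print(sorted(network))
```

Packages that the data source does not know, such as `utils` in the toy data, appear only as dependencies and never as keys. This matches the debug messages above, where base R packages are reported as not found in the Bioconductor data.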

## **Package manager**

In [3]:
from olivia_finder.package_manager import PackageManager

### **Initialization**

**Declare the class**

Initialize the `PackageManager` class with the scraper implementation we want

In [4]:
from olivia_finder.data_source.scrapers.pypi import PypiScraper
pypi_scraper_pm = PackageManager(PypiScraper())

from olivia_finder.data_source.scrapers.npm import NpmScraper
npm_scraper_pm = PackageManager(NpmScraper())

Alternatively, initialize the class from a CSV file

In [4]:
from olivia_finder.data_source.csv_network import CSVNetwork

# Cran data from scraping
cran_scraped_csv_pm = PackageManager(
    CSVNetwork(
        "results/csv_datasets/cran_adjlist_scraping.csv",   # Path to the CSV file
        "CRAN",                                             # Name of the data source
        "CRAN as a CSV file",                               # Description of the data source
        dependent_field="name",                             # Name of the field that contains the name of the dependent package
        dependency_field="dependency",                      # Name of the field that contains the name of the dependency
        dependent_version_field="version",                  # Name of the field that contains the version of the package
        dependency_version_field="dependency_version",     # Name of the field that contains the version of the dependency
        dependent_url_field="url",                          # Name of the field that contains the URL of the package
    )
)

# Cran data from libraries.io
cran_librariesio_csv_pm = PackageManager(
    CSVNetwork(
        "results/csv_datasets/cran_librariesio_dependencies.csv",   # Path to the CSV file  
        "CRAN",                                                     # Name of the data source
        "CRAN as a CSV file",                                       # Description of the data source
        dependent_field="Project Name",                             # Name of the field that contains the name of the dependent package
        dependency_field="Dependency Name",                         # Name of the field that contains the name of the dependency
        dependent_version_field="Version Number",                   # Name of the field that contains the version of the package
        dependency_version_field="Dependency Requirements"          # Name of the field that contains the version of the dependency
    )
)

### **Obtain packages**

#### Get a package from package manager

In [10]:
networkx = pypi_scraper_pm.obtain_package("networkx")
networkx.print()

Package:
  name: networkx
  version: 3.0
  url: https://pypi.org/project/networkx/
  dependencies:
    numpy:(>=1.20)
    scipy:(>=1.8)
    matplotlib:(>=3.4)
    pandas:(>=1.3)
    pre-commit:(>=2.20)
    mypy:(>=0.991)
    sphinx:(==5.2.3)
    pydata-sphinx-theme:(>=0.11)
    sphinx-gallery:(>=0.11)
    numpydoc:(>=1.5)
    pillow:(>=9.2)
    nb2plots:(>=0.6)
    texext:(>=0.6.7)
    lxml:(>=4.6)
    pygraphviz:(>=1.10)
    pydot:(>=1.4.2)
    sympy:(>=1.10)
    pytest:(>=7.2)
    pytest-cov:(>=4.0)
    codecov:(>=2.1)


In [5]:
cran_scraped_csv_pm.obtain_package("A3").print()

Package:
  name: A3
  version: 1.0.0
  url: https://cran.r-project.org/package=A3
  dependencies:
    R:≥ 2.15.0
    xtable:nan
    pbapply:nan


#### Get packages from a list of package names

The web-scraping-based implementation obtains the data from the package manager's website

In [11]:
packages = pypi_scraper_pm.obtain_packages(["networkx", "numpy", "pandas"])
packages

[<olivia_finder.package.Package at 0x7fed5937bd90>,
 <olivia_finder.package.Package at 0x7fed27ae2250>,
 <olivia_finder.package.Package at 0x7fed27ae26a0>]

The CSV-based implementation obtains the data from the CSV file

In [12]:
cran_packages = cran_scraped_csv_pm.obtain_packages(["A3", "pbapply", "xtable"])
cran_packages

[<olivia_finder.package.Package at 0x7f28bada7340>,
 <olivia_finder.package.Package at 0x7f28bad7c5e0>,
 <olivia_finder.package.Package at 0x7f28bad7e0b0>]

#### Get all packages from a package manager

***Note:***

The PackageManager object can store the packages it obtains.

-   This is enabled with the flag

    ```python
    extend=True
    ```

The progress of obtaining the packages can be displayed.

-   This is enabled with the flag

    ```python
    show_progress=True
    ```

Getting all the packages from a package manager can take a while, so it is recommended to save the data to a CSV file for later use.

The progress output below suggests that obtaining the roughly 440,000 packages on PyPI takes between 7 and 8 hours.
For Bioconductor, obtaining its roughly 2,200 packages takes around 4 minutes.

-   From **Scraper** data source implementation

In [7]:
pypi_packages = pypi_scraper_pm.obtain_packages(extend=True, show_progress=True)

 15%|█▌        | 66423/438514 [1:14:50<6:40:39, 15.48it/s] 

In [None]:
bioc_scraper_pm = PackageManager(BiocScraper())
bioconductor_packages = bioc_scraper_pm.obtain_packages(extend=True, show_progress=True)

100%|██████████| 2183/2183 [03:47<00:00,  9.59it/s]


-   From **CSVNetwork** data source implementation

In [8]:
cran_packages = cran_scraped_csv_pm.obtain_packages(extend=True, show_progress=True)

100%|██████████| 18195/18195 [02:04<00:00, 145.67it/s]


As can be seen, the data sources are not consistent with one another, so it is recommended to use the most up-to-date source.
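The progress bars above also allow a rough estimate of the total run time: divide the number of packages by the reported rate. The sketch below does this arithmetic with the figures from the PyPI and Bioconductor runs. It assumes the rate stays constant for the whole run, which a scraping job will only approximate.

```python
def estimated_duration_seconds(total_items, items_per_second):
    """Total time if the whole run proceeds at a constant rate."""
    return total_items / items_per_second


# Totals and rates read from the progress bars above.
pypi_hours = estimated_duration_seconds(438514, 15.48) / 3600
bioc_minutes = estimated_duration_seconds(2183, 9.59) / 60

print(f"PyPI: about {pypi_hours:.1f} hours")
print(f"Bioconductor: about {bioc_minutes:.1f} minutes")
```

The Bioconductor estimate agrees with the 03:47 shown by its completed progress bar, which suggests the same calculation gives a reasonable order of magnitude for the PyPI run.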

### **Data persistence**

A PackageManager object can also be saved to disk and loaded back, which provides persistence and avoids repeating costly processes such as web scraping.

**Save the PackageManager object**

We can save the object with the `save` function.

The file extension is irrelevant because the file is a binary serialization, but by convention the **.olvpm** extension is used to identify PackageManager files.

In [13]:
cran_scraped_csv_pm.save("results/package_managers/cran.olvpm")

**Load the PackageManager object**

We can load the PackageManager object through the static method

```python
PackageManager.load(path: str)
```

In [6]:
cran_loaded_csv_pm = PackageManager.load("results/package_managers/cran.olvpm")
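The text above only states that the file is a binary serialization and does not name the mechanism. As an illustration of the general save/load pattern, the sketch below uses Python's `pickle` module with a toy class. Both the use of `pickle` and the `ToyManager` class are assumptions made for this example, not a description of how `PackageManager` is implemented.

```python
import os
import pickle
import tempfile


class ToyManager:
    """Stand-in for an object holding package data (hypothetical)."""

    def __init__(self, packages):
        self.packages = packages

    def save(self, path):
        # Write the package data to disk in binary form.
        with open(path, "wb") as handle:
            pickle.dump(self.packages, handle)

    @classmethod
    def load(cls, path):
        # Rebuild a manager from a previously saved file.
        with open(path, "rb") as handle:
            return cls(pickle.load(handle))


manager = ToyManager({"A3": ["R", "xtable", "pbapply"]})
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy.olvpm")
    manager.save(path)
    restored = ToyManager.load(path)

print(restored.packages)
```

If a pickle-style format is used, files should only be loaded from trusted sources, because unpickling can execute arbitrary code.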

**Export the CSV format**

We can export the package data to a CSV file, with a structure similar to that of the Libraries.io data.

We can use the following function to generate a pandas DataFrame and then write it to a CSV file:

-   ```python
    pandas_df = package_manager.export_adjlist()
    ```


In [7]:
# Store the package manager as an adjacency list
cran_df = cran_loaded_csv_pm.export_adjlist()
cran_df.to_csv("results/csv_datasets/cran_full_adjlist.csv", index=False)
cran_df.head()

|   | name | dependency |
|---|---|---|
| 0 | A3 | R |
| 1 | A3 | xtable |
| 2 | A3 | pbapply |
| 3 | AATtools | R |
| 4 | AATtools | magrittr |
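The exported adjacency list has one row per (package, dependency) pair. The sketch below shows how such rows can be produced from a mapping of packages to dependencies using only the standard library. It is an illustration of the layout rather than the code behind `export_adjlist`, and `to_adjlist_rows` is a hypothetical helper.

```python
import csv
import io


def to_adjlist_rows(packages):
    """Flatten {package: [dependencies]} into (name, dependency) rows."""
    return [(name, dep) for name, deps in packages.items() for dep in deps]


# A3 dependencies as shown above; AATtools is limited to the two rows in the preview.
packages = {"A3": ["R", "xtable", "pbapply"], "AATtools": ["R", "magrittr"]}
rows = to_adjlist_rows(packages)

# Write the rows to an in-memory CSV with the same two column names.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["name", "dependency"])
writer.writerows(rows)
print(buffer.getvalue())
```

A file in this layout can be read back by a CSV-based data source by pointing its dependent and dependency field arguments at the `name` and `dependency` columns.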
