adding ability to specify architecture and python version (#17)
* adding ability to specify architecture and python version
* we should not exit if there is not a match for a particular release
* adding additional function to get python versions

Signed-off-by: vsoch <vsochat@stanford.edu>
vsoch committed Jan 2, 2021
1 parent e8ddb9d commit 47b42cc
Showing 4 changed files with 95 additions and 8 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -14,6 +14,7 @@ and **Merged pull requests**. Critical items to know are:
The versions coincide with releases on pip.

## [0.2.x](https://github.com/vsoch/caliper/tree/master) (0.0.x)
- ability to specify architecture and python version for packages (0.0.12)
- adding first graph for changedlines metric (0.0.11)
- adding first metric extractors (0.0.1)
- skeleton release (0.0.0)
42 changes: 42 additions & 0 deletions README.md
@@ -74,6 +74,48 @@ manager.specs[-1]
'hash': '238ebd3ca0e0408e0be6780d45deca79583ce99aed05ac6981da7a2b375ae79e'}
```

If you just interact with `manager.specs`, the release chosen for each version can be for
any architecture and Python version. This can be okay if you want to do static file analysis, but if you want to choose
a specific Python version or architecture, your best bet is to call `get_package_metadata`
directly and provide your preferences. For example, here we want TensorFlow for Python 3.5
and a specific Linux architecture:

```python
manager.get_package_metadata(python_version="35", arch="manylinux1_x86_64")
```
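
The selection this call performs can be sketched without caliper. The snippet below is a minimal, standalone approximation, assuming each release record is a dict with `filename` and `url` keys; `pick_release`, `names`, and `sample` are hypothetical names used only for illustration, and the URLs are placeholders.

```python
import re


def pick_release(releases, arch=None, python_version=None):
    """Return one release dict that matches the preferences, or None."""
    if arch:
        releases = [r for r in releases if re.search(arch, r["filename"])]
    if python_version:
        pattern = "cp%s" % python_version
        releases = [r for r in releases if re.search(pattern, r["filename"])]

    selected = None
    for release in releases:
        # Keep only files we could extract: source tarballs and wheels.
        # The last matching file wins.
        if re.search(r"(tar[.]gz|[.]whl)", release["url"]):
            selected = release
    return selected


# Hypothetical records built from wheel names; the URLs are placeholders.
names = [
    "tensorflow-0.12.0-cp27-cp27mu-manylinux1_x86_64.whl",
    "tensorflow-0.12.0-cp35-cp35m-macosx_10_11_x86_64.whl",
    "tensorflow-0.12.0-cp35-cp35m-manylinux1_x86_64.whl",
]
sample = [{"filename": n, "url": "https://example.org/" + n} for n in names]

match = pick_release(sample, arch="manylinux1_x86_64", python_version="35")
print(match["filename"])
```

Both preferences are applied as regular expressions against the filename, so a short value such as `python_version="3"` would match `cp34` and `cp35` alike.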

To derive these search strings, you can look at the filenames of the wheels a package provides.
Filtering isn't the default because not all packages provide such rich choices.
Here is an example from an early version of TensorFlow:

```
tensorflow-0.12.0-cp27-cp27m-macosx_10_11_x86_64.whl
tensorflow-0.12.0-cp27-cp27mu-manylinux1_x86_64.whl
tensorflow-0.12.0-cp34-cp34m-manylinux1_x86_64.whl
tensorflow-0.12.0-cp35-cp35m-macosx_10_11_x86_64.whl
tensorflow-0.12.0-cp35-cp35m-manylinux1_x86_64.whl
tensorflow-0.12.0-cp35-cp35m-win_amd64.whl
```
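
If you would rather derive the search strings programmatically, here is a small sketch that relies only on the standard wheel naming convention, where the last three dash-separated fields are the Python tag, the ABI tag, and the platform tag. The helper `wheel_tags` is a hypothetical name and is not part of caliper.

```python
def wheel_tags(filename):
    """Split a wheel filename into its python, abi, and platform tags."""
    if not filename.endswith(".whl"):
        raise ValueError("Not a wheel filename: %s" % filename)
    stem = filename[: -len(".whl")]
    # The last three dash-separated fields are the python, abi, and platform tags
    python_tag, abi_tag, platform_tag = stem.split("-")[-3:]
    return {"python": python_tag, "abi": abi_tag, "platform": platform_tag}


tags = wheel_tags("tensorflow-0.12.0-cp35-cp35m-manylinux1_x86_64.whl")

# Strings in the form used for python_version and arch above
python_version = tags["python"][len("cp"):]  # assumes a CPython tag such as cp35
arch = tags["platform"]
print(python_version, arch)
```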

For more recent versions you would see Python 3.8 and 3.9, and definitely not 2.x.
`get_package_metadata` still selects a single release per package version based on your preferences. You can also keep
_all_ matching releases for each version with `filter_releases`. For example, here let's narrow down the set
to include those that can be installed on Linux.

```python
releases = manager.filter_releases('manylinux1_x86_64')
```
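
To show the shape of what comes back, here is a standalone sketch of the same filtering idea applied to a plain dictionary. It assumes the releases are keyed by package version, each holding a list of file records; the sample data is made up for illustration.

```python
import re


def filter_releases(releases, regex, search_field="filename"):
    """Keep, for every version, only the files whose field matches regex."""
    filtered = {}
    for version, files in releases.items():
        filtered[version] = [
            f for f in files if re.search(regex, f.get(search_field, ""))
        ]
    return filtered


# Made-up sample: one version with wheels, one with only a source archive
sample = {
    "0.12.0": [
        {"filename": "tensorflow-0.12.0-cp35-cp35m-macosx_10_11_x86_64.whl"},
        {"filename": "tensorflow-0.12.0-cp35-cp35m-manylinux1_x86_64.whl"},
    ],
    "0.0.1": [{"filename": "example-0.0.1.tar.gz"}],
}

linux = filter_releases(sample, "manylinux1_x86_64")
for version, files in linux.items():
    print(version, [f["filename"] for f in files])
```

Note that a version with no matching files is kept with an empty list, so you may want to drop empty entries before iterating.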

You can also get the set of unique Python versions across all releases of the package:

```python
python_versions = manager.get_python_versions()
# {'cp27', 'cp33', 'cp34', 'cp35', 'cp36', 'cp37', 'cp38'}
```

Not every version of a package is guaranteed to provide all of these Python versions, which is
interesting to consider on its own. You can always interact with the raw package metadata at `manager.metadata`.
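
To see which package versions provide which Python version, one option is to group the raw records yourself. This is only a sketch: it assumes each file record carries a `python_version` field and that releases are keyed by package version; `versions_by_python` and the sample data are hypothetical.

```python
from collections import defaultdict


def versions_by_python(releases):
    """Map each Python tag to the sorted package versions that ship it."""
    mapping = defaultdict(set)
    for version, files in releases.items():
        for record in files:
            tag = record.get("python_version")
            if tag:  # skip records with an empty or missing tag
                mapping[tag].add(version)
    return {tag: sorted(versions) for tag, versions in mapping.items()}


# Made-up sample data for illustration
sample = {
    "0.12.0": [{"python_version": "cp27"}, {"python_version": "cp35"}],
    "1.0.0": [{"python_version": "cp35"}, {"python_version": "cp36"}],
}

by_python = versions_by_python(sample)
for tag in sorted(by_python):
    print(tag, by_python[tag])
```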

#### GitHub

We might also be interested in releases from GitHub. Extracting
58 changes: 51 additions & 7 deletions caliper/managers/pypi.py
@@ -15,23 +15,36 @@ class PypiManager(ManagerBase):
     name = "pypi"
     baseurl = "https://pypi.python.org/pypi"

-    def get_package_metadata(self, name=None):
-        """Given a package name, retrieve its metadata from pypi"""
+    def do_metadata_request(self, name=None):
+        """A separate, shared function to retrieve package metadata without
+        doing any custom filtering.
+        """
         name = name or self.package_name
         if not name:
             raise ValueError("A package name is required.")

         url = "%s/%s/json" % (self.baseurl, name)
         self.metadata = do_request(url)

-        # Note that release[0] can be for any architecture, etc.
-        # The indexing appears consistent within a package, so OK for now
+    @property
+    def releases(self):
+        if not self.metadata:
+            self.do_metadata_request()
+        return self.metadata.get("releases", {})
+
+    def get_package_metadata(self, name=None, arch=None, python_version=None):
+        """Given a package name, retrieve its metadata from pypi. Given an arch
+        regex and python version, we look for a particular architecture. Otherwise
+        the choices are a bit random.
+        """
+        # Note that without specifying an arch and python version, the
+        # architecture returned can be fairly random.

         # Parse metadata into simplified version of spack package schema
-        for version, releases in self.metadata.get("releases", {}).items():
+        for version, releases in self.releases.items():

             # Find an appropriate linux/unix flavor release to extract
-            release = self.find_release(releases)
+            release = self.find_release(releases, arch, python_version)

             # Some releases can be empty, skip
             if not releases or not release:
@@ -55,10 +68,41 @@ def get_package_metadata(self, name=None):
         logger.info("Found %s versions for %s" % (len(self._specs), name))
         return self._specs

-    def find_release(self, releases):
+    def get_python_versions(self):
+        """Given a list of releases (or the default) return a list of pep
+        Python versions (e.g., cp38)
+        """
+        python_versions = set()
+        for version, releases in self.releases.items():
+            [
+                python_versions.add(r["python_version"])
+                for r in releases
+                if r["python_version"]
+            ]
+        return python_versions
+
+    def find_release(self, releases=None, arch=None, python_version=None):
         """Given a list of releases, find one that we can extract"""
         filename = None
+        releases = releases or self.releases
+
+        if arch:
+            releases = [r for r in releases if re.search(arch, r["filename"])]
+        if python_version:
+            releases = [
+                r for r in releases if re.search("cp%s" % python_version, r["filename"])
+            ]
+
         for release in releases:
             if re.search("(tar[.]gz|[.]whl)", release["url"]):
                 filename = release
         return filename
+
+    def filter_releases(self, regex, search_field="filename"):
+        """Given a regular expression, filter releases down to smaller list"""
+        filtered = {}
+        for version, releases in self.releases.items():
+            filtered[version] = [
+                r for r in releases if re.search(regex, r.get(search_field, ""))
+            ]
+        return filtered
2 changes: 1 addition & 1 deletion caliper/version.py
@@ -2,7 +2,7 @@
 __copyright__ = "Copyright 2020-2021, Vanessa Sochat"
 __license__ = "MPL 2.0"

-__version__ = "0.0.11"
+__version__ = "0.0.12"
 AUTHOR = "Vanessa Sochat"
 AUTHOR_EMAIL = "vsochat@stanford.edu"
 NAME = "caliper"