BaseMetadata #24

Closed
nkanazawa1989 opened this issue Apr 1, 2021 · 5 comments
Labels
enhancement New feature or request

@nkanazawa1989
Collaborator

nkanazawa1989 commented Apr 1, 2021

What is the expected behavior?

As I wrote here, formatted metadata would be useful for extracting x and y values (see #23). Currently the x value goes by a different name in each PR ("delay" in #5, "meas_basis" in #7, "xdata" in #18), so the naming rule is up to whoever implements the module. This improves the readability of the metadata, but it will be a real headache when writing the analysis superclass.

Here I propose defining a dataclass with some helper methods:

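import dataclasses
from abc import ABC, abstractmethod
from typing import Any, List, Optional

# kw_only (Python 3.10+) lets subclasses add required fields after the defaulted exp_id.
@dataclasses.dataclass(kw_only=True)
class ExperimentMetadata(ABC):
    experiment_type: str
    qubits: List[int]
    exp_id: Optional[str] = None

    def to_dict(self):
        return dataclasses.asdict(self)

    def check_entry(self, **series_kwargs):
        # True if every given key/value pair matches this metadata entry.
        return all(self.to_dict()[key] == value for key, value in series_kwargs.items())

    @abstractmethod
    def get_x_value(self) -> Any:
        """Return the x-axis value of this entry (None if there is no scan)."""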

We assume an experiment entry can be identified by its x_value and series: x_value is the horizontal axis of the graph, while series labels an individual line. Some experiments may have only series. Because the values are provided by a method, we don't need to fill the metadata with empty values, and the extraction method proposed in #23 is still guaranteed to be able to access them.

The extraction method may become

def extract_xy_values(exp_data: ExperimentData, **series: Any): ...

since x_value is provided by the metadata itself. Series becomes kwargs because it may be defined by a dictionary.

# e.g. QPT
extract_xy_values(exp_data, meas_basis=('X',), prep_basis=('Xp',))

The .check_entry method returns True if the input kwargs match the metadata.

I assume we can cover almost all typical experiments with the three subtypes below:
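To make the filtering concrete, here is a minimal sketch of how `extract_xy_values` could combine `check_entry` and `get_x_value`. It is not the API from #23: it assumes, purely for illustration, that the data is a plain list of `(metadata, y_value)` pairs instead of a real `ExperimentData` object, it drops `exp_id` to keep the example short, and `ScanMetadata` and the sample numbers are hypothetical.

```python
import dataclasses
from abc import ABC, abstractmethod
from typing import Any, List, Optional, Tuple


@dataclasses.dataclass
class ExperimentMetadata(ABC):
    experiment_type: str
    qubits: List[int]

    def check_entry(self, **series_kwargs: Any) -> bool:
        # True only if every requested series key exists and matches.
        entry = dataclasses.asdict(self)
        return all(
            key in entry and entry[key] == value
            for key, value in series_kwargs.items()
        )

    @abstractmethod
    def get_x_value(self) -> Optional[float]:
        """Return the x-axis value, or None when there is no scan."""


@dataclasses.dataclass
class ScanMetadata(ExperimentMetadata):
    # Hypothetical line scan with a single series label.
    pulse_duration: int
    meas_basis: str

    def get_x_value(self) -> float:
        return float(self.pulse_duration)


def extract_xy_values(
    entries: List[Tuple[ExperimentMetadata, float]], **series: Any
) -> Tuple[List[Optional[float]], List[float]]:
    # Keep only the entries whose metadata matches the requested series.
    selected = [
        (meta.get_x_value(), y)
        for meta, y in entries
        if meta.check_entry(**series)
    ]
    xs = [x for x, _ in selected]
    ys = [y for _, y in selected]
    return xs, ys


# Assumed sample data: (metadata, measured value) pairs.
entries = [
    (ScanMetadata("ham_tomo", [0, 1], 16, "X"), 0.91),
    (ScanMetadata("ham_tomo", [0, 1], 16, "Z"), 0.12),
    (ScanMetadata("ham_tomo", [0, 1], 32, "X"), 0.75),
]
xs, ys = extract_xy_values(entries, meas_basis="X")
```

With this shape, the caller never needs to know that the x value is called `pulse_duration`; only the series keys appear in the call.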

No scan:

Discriminator experiment

@dataclasses.dataclass
class DiscriminatorExperiment(ExperimentMetadata):
    prep_state: str
    
    def get_x_value(self) -> None:
        return None

extract_xy_values(exp_data, prep_state='00')

Process tomography

@dataclasses.dataclass
class TomographyMetadata(ExperimentMetadata):
    meas_basis: str
    prep_basis: str
    
    def get_x_value(self) -> None:
        return None

extract_xy_values(exp_data, meas_basis=('X',), prep_basis=('Xp',))

Line scan:

Interleaved randomized benchmarking

@dataclasses.dataclass
class RBMetadata(ExperimentMetadata):
    n_clifford: float
    interleaved: bool
    
    def get_x_value(self) -> float:
        return self.n_clifford

extract_xy_values(exp_data, interleaved=True)

T1 measurement

@dataclasses.dataclass
class T1Metadata(ExperimentMetadata):
    delay: int
    
    def get_x_value(self) -> float:
        return self.delay

extract_xy_values(exp_data)

Line scan with multiple series

Hamiltonian tomography

@dataclasses.dataclass
class HamTomographyMetadata(ExperimentMetadata):
    pulse_duration: int
    meas_basis: str
    control_state: int
    
    def get_x_value(self) -> float:
        return self.pulse_duration

extract_xy_values(exp_data, meas_basis='X', control_state=0)
nkanazawa1989 added the enhancement label Apr 1, 2021
@nkanazawa1989
Collaborator Author

Perhaps the last one can be treated as just a line scan, so there are roughly two types. Another case would be a 2D scan, but I don't have any experiment that requires it for now, so it can be a future extension.

@chriseclectic
Collaborator

chriseclectic commented Apr 1, 2021

A dataclass seems like a nice way to document metadata for an experiment. Though if we use them, we should try to use the functionality of the dataclasses module rather than add our own dict-like API wrapper on top of it. Most of the extra methods here seem unnecessary:

  • Rather than wrapping it in to_dict, just use the built-in dataclasses.asdict. In general, though, if we are using dataclasses we should avoid using them as dicts except when serializing for execution or DB storage.
  • check_entry should use hasattr(datacls, attr), not conversion to a dict. You could probably do something with dataclasses.fields instead. Part of the point of using dataclasses is that you don't need to do these checks; you can use type checking instead to know whether it has certain fields.
  • I don't think building in methods for returning x values is necessary. If an experiment has a metadata parameter for x values, it should just save it as xval: float. This can then be accessed using data.xval or getattr(data, "xval"). Of course this only matters for experiments where you plan on using common analysis classes (such as curve fitting).
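The three points above can be sketched roughly as follows. This is only an illustration under stated assumptions: `T1Metadata` here is a hypothetical plain container with an `xval` field, and `matches` is a made-up helper name, not something from the repository. It uses only `dataclasses.fields`, `dataclasses.asdict`, and `getattr` from the standard library.

```python
import dataclasses
from typing import Any


@dataclasses.dataclass
class T1Metadata:
    # Hypothetical plain container; the x value is stored directly as `xval`.
    experiment_type: str
    xval: float


def matches(metadata: Any, **series: Any) -> bool:
    # Compare against declared dataclass fields only; unknown keys never match.
    names = {f.name for f in dataclasses.fields(metadata)}
    return all(
        key in names and getattr(metadata, key) == value
        for key, value in series.items()
    )


meta = T1Metadata(experiment_type="T1", xval=2.5e-6)

# Attribute access for analysis code, dict form only when serializing.
x_value = meta.xval
serialized = dataclasses.asdict(meta)
```

The check lives outside the container, so the dataclass itself stays free of extra methods.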

@nkanazawa1989
Collaborator Author

nkanazawa1989 commented Apr 5, 2021

Thanks Chris for the feedback.

  • About to_dict: yes, we can use the built-in function. However, this means we need to import dataclasses in many modules. The wrapper is just there to avoid that import.
  • About check_entry: I totally agree. The code above is just an example to show the concept (maybe I should have written pseudocode...).
  • About xval: I think this depends on who uses the metadata. If metadata is a public member (and intended to be exposed to end users), I prefer a more meaningful name than just xval. Another case where a method is useful is a multi-axis scan. Adding data.xval1 and data.xval2 would be the most straightforward way to create the metadata, but this is quite confusing and a developer may introduce a bug. If this is implemented as a method, we can override it with, for example
    def get_x_value(self) -> Tuple[float, float]:
        return (self.frequency, self.amp)

@chriseclectic
Collaborator

Anything the end user cares about should be in the analysis result, and the average user never needs to see raw metadata. It's only there so you can run the analysis. This is why I think it should just be a simple container (whether dict, simple namespace, dataclass etc), not something you add a lot of methods to. Extra methods for getting values or processing should be part of the analysis classes + data processor classes where the metadata is used.

It seems like most of what this is aiming to accomplish with the x-y values would also be accomplished by having general purpose analysis classes for the various types of curve fitting. These should then be used as parent classes for other experiments.
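As a rough sketch of that idea, a general-purpose parent class could own the x-y extraction while each experiment's analysis subclass only names the metadata key it uses. Everything here is an assumption for illustration: the class names, the `x_key` attribute, and the data layout (a list of dicts with `"metadata"` and `"y"` entries) are hypothetical and not taken from the repository.

```python
from typing import Any, Dict, List, Tuple


class CurveAnalysisSketch:
    """Hypothetical parent class that extracts x-y values for curve fitting."""

    # Metadata key holding the x value; subclasses may override it.
    x_key = "xval"

    def extract(
        self, data: List[Dict[str, Any]], **series: Any
    ) -> Tuple[List[float], List[float]]:
        xs: List[float] = []
        ys: List[float] = []
        for entry in data:
            meta = entry["metadata"]
            # Keep the entry only if every series key exists and matches.
            if all(k in meta and meta[k] == v for k, v in series.items()):
                xs.append(meta[self.x_key])
                ys.append(entry["y"])
        return xs, ys


class T1AnalysisSketch(CurveAnalysisSketch):
    # The T1 experiment stores its x value under "delay".
    x_key = "delay"


# Assumed sample data with plain dict metadata.
data = [
    {"metadata": {"delay": 1.0}, "y": 0.9},
    {"metadata": {"delay": 2.0}, "y": 0.8},
]
xs, ys = T1AnalysisSketch().extract(data)
```

Here the metadata stays a simple container, and the knowledge of which field is the x axis sits in the analysis class that is attached to the experiment.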

@nkanazawa1989
Collaborator Author

I understand. It sounds like a dataclass is not necessary and the analysis should know the fields of the metadata, i.e. the generator and the analysis are attached to the same experiment. The purpose of this issue is to decouple metadata from analysis so that we can build general-purpose analysis, as you mentioned. This means the extract function API proposed in #23 is sufficient and I can close the issue.
