# chemhist: Installation and Usage Example

This notebook demonstrates how to install and use the `chemhist` package to generate histogram-based descriptors from chemical formulas.

## 1. Installation

You can install the package locally using pip:
```bash
cd path/to/chemhist_project
pip install .
```
Or in editable mode (recommended for development):
```bash
pip install -e .
```
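If you encounter build errors, remove the previous build directories and reinstall. In PowerShell (on Linux or macOS, use `rm -rf build dist chemhist.egg-info` instead):
```powershell
Remove-Item -Recurse -Force build, dist, chemhist.egg-info
```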
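
The same cleanup can be done from Python, which avoids shell differences between Windows and Unix-like systems. This is a minimal sketch; `clean_build_artifacts` is a hypothetical helper, not part of `chemhist`, and the directory names are the three listed above.

```python
import shutil
from pathlib import Path


def clean_build_artifacts(project_dir, names=("build", "dist", "chemhist.egg-info")):
    """Delete leftover build directories under project_dir.

    Returns the names that were actually removed, in the order checked.
    """
    removed = []
    for name in names:
        target = Path(project_dir) / name
        # Skip anything that does not exist so the call is safe to repeat.
        if target.is_dir():
            shutil.rmtree(target)
            removed.append(name)
    return removed
```

Call `clean_build_artifacts(".")` from the project root, then rerun `pip install .`.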

In [2]:
import chemhist
print(chemhist.__file__)

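None

A path of `None` typically means Python resolved `chemhist` as a namespace package (a directory named `chemhist` with no `__init__.py`, often the project folder itself) rather than the installed package. If you see this, confirm that the install step succeeded and that the notebook is not running from a directory that contains a bare `chemhist` folder.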
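
One way to check how Python resolves a package before importing anything from it is `importlib.util.find_spec` from the standard library. The sketch below wraps it in a hypothetical helper, `describe_install`, which is an illustration only and not part of `chemhist`.

```python
import importlib.util


def describe_install(package_name):
    """Return a short description of how Python resolves package_name."""
    try:
        spec = importlib.util.find_spec(package_name)
    except (ImportError, ValueError):
        spec = None
    if spec is None:
        return "not found"
    # Namespace packages have no origin (older Python versions report "namespace").
    if spec.origin in (None, "namespace"):
        return "namespace package (no __init__.py found)"
    return f"regular module at {spec.origin}"


print(describe_install("chemhist"))
```

A result beginning with `regular module at` and pointing into the project or `site-packages` suggests that the install worked.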


## 2. Generate Histogram Descriptors
Use `get_descriptor` to convert a chemical formula into a numerical histogram vector and a matching list of labels, one label per vector entry.

In [3]:
from chemhist import get_descriptor

formula = 'Li0.5Mn1.0O2'
vec, labels = get_descriptor(formula, algebricdesc=True, matrixdesc=True)

print('Descriptor vector length:', len(vec))
print('First 10 values:', vec[:10])
print('First 10 labels:', labels[:10])

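ImportError: cannot import name 'get_descriptor' from 'chemhist' (unknown location)

The `(unknown location)` in this error indicates that the imported `chemhist` module has no file path, which matches the `None` printed in Section 1. The import succeeds only once the package is correctly installed, and the remaining cells depend on the `vec` and `labels` produced here.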
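
To make concrete what turning a formula into numbers involves, here is a toy sketch of the first step only: parsing a formula such as `Li0.5Mn1.0O2` into element amounts and atomic fractions. This is not `chemhist`'s implementation. `parse_formula` and `atomic_fractions` are hypothetical helpers, and only flat formulas without parentheses are handled.

```python
import re

# One token: an element symbol plus an optional amount, e.g. "Li0.5", "O2", "Mn".
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?")
# A whole formula: one or more tokens and nothing else.
_FORMULA = re.compile(r"(?:[A-Z][a-z]?(?:\d+(?:\.\d+)?)?)+")


def parse_formula(formula):
    """Return {element: amount} for a flat formula; a missing amount means 1."""
    if not _FORMULA.fullmatch(formula):
        raise ValueError(f"unsupported formula: {formula!r}")
    amounts = {}
    for symbol, amount in _TOKEN.findall(formula):
        # Repeated symbols are summed.
        amounts[symbol] = amounts.get(symbol, 0.0) + (float(amount) if amount else 1.0)
    return amounts


def atomic_fractions(amounts):
    """Normalize element amounts so that they sum to 1."""
    total = sum(amounts.values())
    return {element: amount / total for element, amount in amounts.items()}


composition = parse_formula("Li0.5Mn1.0O2")
print(composition)
print(atomic_fractions(composition))
```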

## 3. Visualize the Descriptor
Plot each descriptor value as a bar against its index in the vector.

In [None]:
import matplotlib.pyplot as plt

plt.figure(figsize=(12, 4))
plt.bar(range(len(vec)), vec, color='steelblue')
plt.xlabel('Descriptor Index')
plt.ylabel('Value')
plt.title(f'Histogram Descriptor for {formula}')
plt.tight_layout()
plt.show()

## 4. Grouped Visualization by Property
Plot the descriptor grouped by property name (EN, AN, IR, etc.), taken as the part of each label before the first underscore, preserving the original order of appearance.

In [None]:
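# Property name = part of each label before the first underscore
props = [l.split('_')[0] for l in labels]
# Unique property names, in order of first appearance
unique_props = list(dict.fromkeys(props))

plt.figure(figsize=(12, 4))
for p in unique_props:
    # Compare whole prefixes so one name cannot match another that merely starts with it
    idx = [i for i, q in enumerate(props) if q == p]
    # Index element-wise so this works whether vec is a list or a NumPy array
    plt.bar(idx, [vec[i] for i in idx], label=p)

plt.legend(ncol=4)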
plt.xlabel('Descriptor Index (in original order)')
plt.ylabel('Value')
plt.title(f'Histogram Descriptors Grouped by Property ({formula})')
plt.tight_layout()
plt.show()
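
The grouping step can also be pulled out into a small function that is easy to test without plotting. This is a sketch under the same assumption as the cell above, namely that each label has the form `<property>_<rest>`. `group_indices_by_property` is a hypothetical helper, and the label names in the usage note are made up for illustration.

```python
def group_indices_by_property(labels):
    """Map each property prefix to the list of indices whose label carries it.

    Keys appear in order of first appearance because dicts keep insertion order.
    """
    groups = {}
    for i, label in enumerate(labels):
        prop = label.split("_")[0]
        groups.setdefault(prop, []).append(i)
    return groups
```

For example, `group_indices_by_property(["EN_0", "EN_1", "AN_0"])` gives `{"EN": [0, 1], "AN": [2]}`, and each index list can be passed directly to `plt.bar`.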

## 5. Save Descriptor to CSV
Export the descriptor as a one-row CSV file, with the labels as column headers and the formula in the first column.

In [None]:
import pandas as pd

df = pd.DataFrame([vec], columns=labels)
df.insert(0, 'formula', formula)
df.to_csv('chemhist_output.csv', index=False)
print('Saved as chemhist_output.csv')
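
If several formulas share the same labels, the same layout extends to one row per formula. The sketch below uses only the standard library `csv` module. `write_descriptors` is a hypothetical helper, and it assumes every vector has exactly one value per label.

```python
import csv


def write_descriptors(file_obj, labels, rows):
    """Write a header plus one CSV row per (formula, vec) pair.

    Returns the number of data rows written.
    """
    writer = csv.writer(file_obj)
    writer.writerow(["formula", *labels])
    count = 0
    for formula, vec in rows:
        # Refuse rows that would not line up with the header.
        if len(vec) != len(labels):
            raise ValueError(f"{formula}: expected {len(labels)} values, got {len(vec)}")
        writer.writerow([formula, *vec])
        count += 1
    return count
```

Open the target with `open('chemhist_batch.csv', 'w', newline='')` and pass the file object together with `labels` and a list such as `[(formula, vec)]`.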