# Exploring UCC cluster data

This notebook allows you to explore the data for each cluster in the UCC. The cluster datafiles contain the frame processed by `fastMP` with membership probabilities assigned for all the stars.

## Import packages and load data

First we define the name of the cluster to be analyzed:

In [None]:
cluster = "cwnu3680"

and import the required packages:

In [None]:
import numpy as np
import pandas as pd
from bokeh.io import output_notebook
from bokeh.models import ColorBar, ColumnDataSource, LinearColorMapper, Range1d
from bokeh.plotting import figure, show

# Render Bokeh plots inline in the notebook
output_notebook()

After importing the required packages, we load the cluster data into the dataframe `df` (reading Parquet files with pandas requires `pyarrow` or `fastparquet` to be installed):

In [None]:
path = "https://github.com/ucc23/Q1P/raw/main/datafiles/"
df = pd.read_parquet(path + cluster + ".parquet")

The most probable members are selected using a probability cut `P>0.5`. The minimum number of member stars is set to `25`: if fewer than `25` stars have `P>0.5`, the most probable members are the `25` stars with the largest probability values.
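The selection rule described above can be sketched as follows. This is an illustration only, not the code used to build the UCC datafiles: the helper name `select_members`, its parameters, and the toy dataframe are assumptions made for this example. Only the `probs` column name comes from the datafiles used in this notebook.

```python
import numpy as np
import pandas as pd


def select_members(frame, prob_col="probs", p_cut=0.5, n_min=25):
    """Return the most probable members of a frame (hypothetical helper)."""
    # Stars that pass the probability cut
    members = frame[frame[prob_col] > p_cut]
    if len(members) < n_min:
        # Too few stars pass the cut: fall back to the n_min stars
        # with the largest probability values
        members = frame.nlargest(n_min, prob_col)
    return members


# Toy frame (made-up data): 40 stars with probabilities evenly spaced in [0, 1]
toy = pd.DataFrame({"probs": np.linspace(0, 1, 40)})
selected = select_members(toy)
```

In this toy frame only 20 stars have `P>0.5`, so the fallback applies and the 25 stars with the largest probabilities are returned.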

## Define plotting functions

Define a function to generate scatter plots, with the points colored by the values in the column `col`:

In [None]:
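def scatter_plot(x, y, col, flip_yaxis=False):
    # Plot df[x] vs df[y] (global dataframe), colored by df[col]
    members = ColumnDataSource({'xm': df[x], 'ym': df[y], 'col': df[col]})
    # The color scale spans the probability range [0.5, 1]
    cmap = LinearColorMapper(palette="Viridis256", low=0.5, high=1)

    p = figure()
    # scatter() draws circle markers by default; circle() with `size`
    # is deprecated in recent Bokeh releases
    p.scatter("xm", "ym", size=10, source=members, line_color='black',
              alpha=.75, fill_color={"field": "col", "transform": cmap})
    if flip_yaxis:
        # Useful for magnitudes, where brighter stars have smaller values
        p.y_range.flipped = True
    bar = ColorBar(color_mapper=cmap)
    p.add_layout(bar, "right")
    p.xaxis.axis_label = x
    p.yaxis.axis_label = y
    show(p)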

And a function to generate histograms:

In [None]:
def histo_plot(x):
    # Drop NaN values first; np.histogram fails on non-finite data
    values = df[x].dropna()
    p = figure()
    # Histogram for member stars
    hist, edges = np.histogram(values, bins=20)
    p.quad(top=hist, bottom=0, left=edges[:-1], right=edges[1:],
           fill_color="skyblue", line_color="white", alpha=.75)
    # Vertical line at the median (a ray with length=0 is unbounded)
    p.ray(x=[values.median()], y=[0], length=0, angle=90,
          angle_units='deg', line_width=3, line_color='red')
    # Pad the x range by 10% of the data span, so that negative values
    # are handled correctly
    pad = 0.1 * (values.max() - values.min())
    p.x_range = Range1d(values.min() - pad, values.max() + pad)
    p.xaxis.axis_label = x
    p.yaxis.axis_label = "N"
    show(p)

## Generate interactive plots

Now we can generate some interactive plots. For example, the distribution of galactic coordinates for member stars, colored by their membership probability:

In [None]:
x, y, col = 'GLON', 'GLAT', 'probs'
scatter_plot(x, y, col)

And a color-magnitude diagram, with the y axis flipped so that brighter stars (smaller magnitudes) appear at the top:

In [None]:
x, y, col = 'BP-RP', 'Gmag', 'probs'
scatter_plot(x, y, col, True)

The distribution of proper motions:

In [None]:
x, y, col = 'pmRA', 'pmDE', 'probs'
scatter_plot(x, y, col)

Finally, a histogram of the parallaxes, with the median value of the selected members shown as a red vertical line:

In [None]:
histo_plot('Plx')
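
The median marked by the red line can also be read off numerically. The sketch below is a small addition, not part of the original notebook: the helper name `summarize` and the toy parallax values are assumptions made for this example. It reports the number of valid values, the median, and the 16th and 84th percentiles, ignoring NaN entries.

```python
import numpy as np
import pandas as pd


def summarize(values):
    """Median and 16th/84th percentiles of a column, ignoring NaN
    (hypothetical helper)."""
    arr = np.asarray(values, dtype=float)
    p16, med, p84 = np.nanpercentile(arr, [16, 50, 84])
    return {
        "n": int(np.sum(~np.isnan(arr))),  # number of valid (non-NaN) values
        "median": float(med),
        "p16": float(p16),
        "p84": float(p84),
    }


# Toy parallax values (made-up data), including one missing entry
toy_plx = pd.Series([0.9, 1.0, 1.1, np.nan, 1.2, 1.0])
summary = summarize(toy_plx)
```

In the notebook, the same helper could be applied to the loaded data with `summarize(df['Plx'])`.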