# Introduction

This notebook presents the first stage of a broader project focused on analyzing carbon stars in the Galactic plane. The goal of this stage is to identify the closest stellar counterparts to a set of five known carbon stars, using positional data from near-infrared catalogs.

The data was kindly provided by **Dr. David Merlo** and includes the following components:

- `Merlo2015.pdf`: an article explaining the scientific context of the project.
- Multiple `.asc` files: raw infrared source catalogs sorted by sky coordinates. Each file corresponds to observations in a specific photometric filter (Ks, H, J, Y, Z).
- `lista.cat`: a catalog summary file indicating the filter and observation date associated with each `.asc` file.
- `fuentes.txt`: a list of five carbon stars (labeled s0 to s4) with their J2000 coordinates. These are the reference targets for this study.
- Additional test files (e.g., `Ks20s0`, `Ks20s1`, etc.): preliminary measurements for visual inspection.

The main objectives of this notebook are to:
1. Load and process each `.asc` catalog file.
2. Filter the sources to retain only stellar-type objects.
3. Convert catalog coordinates to a uniform astrometric format.
4. Compute the angular separation between each source and the five carbon stars.
5. Identify and save the closest match per star per catalog.
6. Combine all results into a unified dataset for future light curve construction.

This positional filtering and source selection step lays the foundation for the next notebook, where the temporal behavior of these stars will be modeled.


# 1. Previewing a Sample `.asc` Catalog

This block loads a single `.asc` file from the local directory and inspects its structure.  
It reads the data into a DataFrame and prints the first few rows to verify the column layout and content.


In [None]:
# Execution Protection: prevents accidental execution   

run_example = False 

if run_example:

    # CODE: 
    
    #from google.colab import drive
    #drive.mount('/content/drive')
    
    #import os
    #os.listdir('/content')
    
    import pandas as pd
    import os

    # Load a sample .asc file from the local environment
    directory = '/content'
    example_file = [f for f in os.listdir(directory) if f.endswith('.asc')][0]
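    # sep=r'\s+' splits on any run of whitespace (delim_whitespace is deprecated since pandas 2.2)
    df_example = pd.read_csv(os.path.join(directory, example_file), sep=r'\s+', header=None)

    # Display the first few rows to inspect structure
    # (a bare expression inside an `if` block is not rendered by the notebook, so print it)
    print(df_example.head())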

# Execution Protection: prevents accidental execution  
else:
    print("Execution blocked")


### Result:


| Row | Col0 | Col1 | Col2    | Col3 | Col4 | Col5  | Col6    | Col7  | Col8    | Col9   | Col10 | Col11 | Col12 | Col13  |
|-----|------|------|---------|------|------|-------|---------|-------|---------|--------|-------|-------|-------|--------|
| 0   | 12   | 15   | 53.918  | -64  | 12   | 31.52 | 5927.36 | 3.00  | 15.974  | 0.084  | 1     | 1     | 0.74  | -6.51  |
| 1   | 12   | 16   | 14.727  | -63  | 56   | 20.69 | 3056.21 | 3.24  | 16.009  | 0.085  | 1     | 1     | 0.48  | -0.82  |
| 2   | 12   | 16   | 2.676   | -64  | 5    | 45.83 | 4727.49 | 3.30  | 16.401  | 0.120  | 1     | -1    | 0.38  | -5.33  |
| 3   | 12   | 16   | 2.184   | -64  | 6    | 8.31  | 4793.99 | 3.16  | 16.090  | 0.091  | 1     | 1     | 0.31  | 173.46 |
| 4   | 12   | 15   | 54.633  | -64  | 11   | 59.09 | 5831.36 | 3.31  | 16.049  | 0.089  | 1     | 1     | 0.44  | 170.52 |


# 2. Processing Individual `.asc` Catalogs

Each of the 48 `.asc` catalogs was processed individually due to memory and runtime constraints.

#### Processing Strategy
- Only rows classified as **stellar sources** (`type = -1`) were selected for further analysis.
- The coordinates (RA in columns 0–2 as hours, minutes, seconds; Dec in columns 3–5 as degrees, arcminutes, arcseconds) were converted to `SkyCoord` objects for precise angular calculations.
- For each catalog, the closest source was identified relative to the fixed J2000 coordinates of the five carbon stars listed in `fuentes.txt`.
- A dedicated code block was used to process each `.asc` file individually. An example is shown below.
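
The coordinate conversion described above can be illustrated without `astropy`. The sketch below is a plain-Python cross-check, not the method used in this notebook: `sexagesimal_to_degrees` is a hypothetical helper introduced only for illustration. It reads the sign from the text of the degrees field, so a declination between 0° and -1° (written as `-0`) would keep its sign, a case that a numeric `< 0` test would miss.

```python
def sexagesimal_to_degrees(first, minutes, seconds, hours=False):
    """Convert (h, m, s) or (d, m, s) fields to decimal degrees.

    `first` is the hours or degrees field; pass it as text to preserve '-0'.
    Set hours=True for right ascension (1 hour = 15 degrees).
    """
    text = str(first).strip()
    sign = -1.0 if text.startswith('-') else 1.0
    magnitude = abs(float(text)) + float(minutes) / 60.0 + float(seconds) / 3600.0
    if hours:
        magnitude *= 15.0
    return sign * magnitude


# First row of the sample catalog shown in Section 1
ra_deg = sexagesimal_to_degrees('12', '15', '53.918', hours=True)
dec_deg = sexagesimal_to_degrees('-64', '12', '31.52')
print(f"RA = {ra_deg:.6f} deg, Dec = {dec_deg:.6f} deg")
```

The same helper applied to the `s0` match in the final table (12 15 49.609, -64 20 35.77) should agree with the decimal coordinates listed there, which gives a quick consistency check on the `SkyCoord` conversion.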

#### Column Mapping
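- **Columns 0–2**: Right ascension (hours, minutes, seconds).
- **Columns 3–5**: Declination (degrees, arcminutes, arcseconds).
- **Column 8**: Magnitude (in the photometric filter of that catalog).
- **Column 9**: Magnitude error.
- **Column 11**: Source type; `-1` marks a stellar source, the only class retained here.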
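
To make the mapping above concrete, here is a minimal sketch that turns one raw catalog line into named fields. The field names in `COLUMN_NAMES` and the function `parse_catalog_line` are hypothetical, chosen for readability; only the columns documented above are named, and the remaining columns are deliberately left out because their meaning is not described here.

```python
# Hypothetical field names for the documented columns (0-based indices).
COLUMN_NAMES = {
    0: 'ra_h', 1: 'ra_m', 2: 'ra_s',
    3: 'dec_d', 4: 'dec_m', 5: 'dec_s',
    8: 'mag', 9: 'mag_err', 11: 'source_type',
}


def parse_catalog_line(line):
    """Split one whitespace-delimited catalog line into a dict of named fields."""
    fields = line.split()
    if len(fields) < 14:
        raise ValueError(f"expected at least 14 columns, got {len(fields)}")
    record = {name: fields[index] for index, name in COLUMN_NAMES.items()}
    # Coordinate components stay as text; only photometry and the type flag are converted.
    record['mag'] = float(record['mag'])
    record['mag_err'] = float(record['mag_err'])
    record['source_type'] = int(record['source_type'])
    return record


# Row 2 of the sample output in Section 1 (the only sample row with type -1)
sample = "12 16 2.676 -64 5 45.83 4727.49 3.30 16.401 0.120 1 -1 0.38 -5.33"
record = parse_catalog_line(sample)
is_stellar = record['source_type'] == -1
print(record)
print("stellar source:", is_stellar)
```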


In [None]:
# Execution Protection: prevents accidental execution   

run_example = False 

if run_example:

    # CODE: 
    import pandas as pd
    from astropy.coordinates import SkyCoord
    import astropy.units as u
    import os
    from google.colab import files

    # Flag to avoid accidental re-execution
    execution_protection_flag = False

    # Function to process a single .asc file
    def process_asc_file(file_path, carbon_stars):
        # sep=r'\s+' splits on any run of whitespace (delim_whitespace is deprecated since pandas 2.2)
        df = pd.read_csv(file_path, sep=r'\s+', header=None)
    
        # Skip files that do not have the full 14-column layout
        if df.shape[1] >= 14:
            # Select relevant columns
            df_filtered = df[[0, 1, 2, 3, 4, 5, 8, 9, 11]].copy()

            # Filter only stellar sources (type -1); copy so new columns can be added safely
            df_stars = df_filtered[df_filtered[11] == -1].copy()

            # Guard against catalogs without stellar sources (idxmin fails on an empty frame)
            if df_stars.empty:
                print(f"File {file_path} ignored: no stellar sources.")
                return []

            # Convert coordinates to SkyCoord
            df_stars['coord'] = df_stars.apply(convert_to_skycoord, axis=1)

            # Find closest source to each carbon star
            closest_stars = []
            for key, star in carbon_stars.items():
                star_coord = SkyCoord(ra=star['ra_dec'] * u.deg, dec=star['dec_dec'] * u.deg)
                # Copy the matched row so tagging it does not touch df_stars
                closest_star = find_closest_star(star_coord, df_stars).copy()
                closest_star['carbon_star_key'] = key
                closest_stars.append(closest_star)

            return closest_stars
        else:
            print(f"File {file_path} ignored: not enough columns.")
            return []

    # Convert row-based coordinates to SkyCoord object
    def convert_to_skycoord(row):
        ra = f"{int(row[0])}h{int(row[1])}m{row[2]}s"
        dec_sign = '-' if float(row[3]) < 0 else '+'
        dec = f"{dec_sign}{abs(int(row[3]))}d{int(row[4])}m{row[5]}s"
        return SkyCoord(ra, dec, frame='icrs')

    # Find the closest star to a given coordinate
    def find_closest_star(star_coord, df):
        df['separation'] = df['coord'].apply(lambda x: star_coord.separation(x).arcsecond)
        return df.loc[df['separation'].idxmin()]

    # Carbon stars data (J2000)
    carbon_stars = {
        's0': {'name': '1215-6420', 'ra': '12 15 49.6', 'dec': '-64 20 36'},
        's1': {'name': '1216-6420', 'ra': '12 15 58.5', 'dec': '-64 20 37'},
        's2': {'name': '1219-6423', 'ra': '12 19 16.2', 'dec': '-64 23 14'},
        's3': {'name': '1224-6413', 'ra': '12 24 27.0', 'dec': '-64 13 15'},
        's4': {'name': '1228-6427', 'ra': '12 28 19.4', 'dec': '-64 27 24'}
    }

    # Convert RA/Dec to decimal degrees
    for star in carbon_stars.values():
        coord = SkyCoord(f"{star['ra']} {star['dec']}", unit=(u.hourangle, u.deg))
        star['ra_dec'] = coord.ra.deg
        star['dec_dec'] = coord.dec.deg

    # Execution confirmation
    if not execution_protection_flag:
        confirmation = input("Are you sure you want to run the processing? (y/n): ")
        if confirmation.lower() == 'y':
            # Upload .asc file
            uploaded = files.upload()

            # Process each uploaded file
            for filename in uploaded.keys():
                print(f"Processing file: {filename}")
                closest_stars = process_asc_file(filename, carbon_stars)

                # Save results to CSV
                if closest_stars:
                    output_filename = f"{os.path.splitext(filename)[0]}_results.csv"
                    df_closest = pd.DataFrame(closest_stars)
                    df_closest.to_csv(output_filename, mode='w', header=True, index=False)
                    files.download(output_filename)

                # Print results
                for key, value in carbon_stars.items():
                    print(f"Closest star to {key} ({carbon_stars[key]['name']}):")
                    result = [star for star in closest_stars if star['carbon_star_key'] == key]
                    if result:
                        print(result[0])
                    else:
                        print("No result found.")

            execution_protection_flag = True
        else:
            print("Execution cancelled.")
    else:
        print("Code already executed once. Execution protection is active.")

# Execution Protection: prevents accidental execution  
else:
    print("Execution blocked")

# 3. Merging Individual Results

After processing all catalogs individually, this block:
- Uploads each CSV with results
- Merges them into one consolidated DataFrame
- Exports the final table to `combined_results.csv`

This table contains, for each carbon star in each catalog:
- Filename
- Closest source’s magnitude and error

This merged dataset will be used in the next step to select the best overall match for each star.


In [None]:
# Execution Protection: prevents accidental execution   

run_example = False 

if run_example:

    # CODE: 
    import pandas as pd
    import io
    from google.colab import files

    # Upload the individual result files
    uploaded = files.upload()

    # Combine the files into a single DataFrame
    df_list = []
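    for file_name in uploaded.keys():
        df = pd.read_csv(io.BytesIO(uploaded[file_name]), header=0)
        # Record the originating file so each row can be traced back to its catalog
        df['source_file'] = file_name
        df_list.append(df)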

    df_combined = pd.concat(df_list, ignore_index=True)

    # Save the combined DataFrame to a CSV file for review
    df_combined.to_csv('combined_results.csv', index=False)
    files.download('combined_results.csv')

    print("Combined results saved as 'combined_results.csv'")

# Execution Protection: prevents accidental execution  
else:
    print("Execution blocked")

# 4. Final Selection

Finally, the combined results file was processed to identify the overall closest match for each carbon star.  
The coordinates and the minimum angular separation for each carbon star were calculated and stored in a final DataFrame.
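
One way to picture this selection is as a group-wise minimum: for each carbon star, keep the candidate row with the smallest separation. The sketch below is a plain-Python illustration under that reading, not the notebook's implementation. The field names `carbon_star_key` and `separation` mirror the columns written during processing, `source` is a hypothetical label, and the rows are made-up placeholder values rather than catalog measurements.

```python
def closest_per_star(rows):
    """Return, for each carbon star key, the row with the smallest separation."""
    best = {}
    for row in rows:
        key = row['carbon_star_key']
        if key not in best or row['separation'] < best[key]['separation']:
            best[key] = row
    return best


# Placeholder rows for illustration only (not taken from the catalogs)
rows = [
    {'carbon_star_key': 's0', 'separation': 0.52, 'source': 'catalog_a'},
    {'carbon_star_key': 's0', 'separation': 0.31, 'source': 'catalog_b'},
    {'carbon_star_key': 's1', 'separation': 0.47, 'source': 'catalog_a'},
    {'carbon_star_key': 's1', 'separation': 0.88, 'source': 'catalog_b'},
]

best = closest_per_star(rows)
for key, row in best.items():
    print(f"{key}: {row['source']} at {row['separation']} arcsec")
```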


In [None]:
# Execution Protection: prevents accidental execution   

run_example = False 

if run_example:

    # CODE:

    import pandas as pd
    from astropy.coordinates import SkyCoord
    import astropy.units as u
    from google.colab import files

    # Upload the combined results file
    uploaded = files.upload()

    # Get the uploaded file name
    file_path = list(uploaded.keys())[0]
    print(f"Uploaded file: {file_path}")

    # Load the combined CSV containing results from all 48 files
    df_combined = pd.read_csv(file_path)

    # Function to parse 'coord' column back into SkyCoord objects
    def parse_skycoord(coord_str):
        try:
            # Clean the string and extract RA and Dec values
            coord_str = coord_str.replace('<SkyCoord (ICRS): (ra, dec) in deg\n    (', '').replace(')>', '')
            ra_dec_list = coord_str.split(',')

            ra = float(ra_dec_list[0].strip())
            dec = float(ra_dec_list[1].strip())
            return SkyCoord(ra=ra, dec=dec, unit='deg', frame='icrs')
        except Exception as e:
            print(f"Error parsing coord: {coord_str} -> {e}")
            return None

    # Convert the 'coord' column into SkyCoord objects
    df_combined['coord'] = df_combined['coord'].apply(parse_skycoord)

    # Drop rows with invalid coordinates; copy so the 'separation' column can be added safely
    df_combined = df_combined.dropna(subset=['coord']).copy()

    # Carbon star reference data
    carbon_stars = {
        's0': {'name': '1215-6420', 'ra': '12 15 49.6', 'dec': '-64 20 36'},
        's1': {'name': '1216-6420', 'ra': '12 15 58.5', 'dec': '-64 20 37'},
        's2': {'name': '1219-6423', 'ra': '12 19 16.2', 'dec': '-64 23 14'},
        's3': {'name': '1224-6413', 'ra': '12 24 27.0', 'dec': '-64 13 15'},
        's4': {'name': '1228-6427', 'ra': '12 28 19.4', 'dec': '-64 27 24'}
    }

    # Convert RA/Dec to decimal degrees
    for star in carbon_stars.values():
        coord = SkyCoord(f"{star['ra']} {star['dec']}", unit=(u.hourangle, u.deg))
        star['ra_dec'] = coord.ra.deg
        star['dec_dec'] = coord.dec.deg

    # Function to find the closest match from a DataFrame
    def find_closest_star(star_coord, df):
        df['separation'] = df['coord'].apply(lambda x: star_coord.separation(x).arcsecond)
        closest_star = df.loc[df['separation'].idxmin()]
        return closest_star

    # Find the overall closest star for each carbon star
    general_closest_stars = {}
    for key, star in carbon_stars.items():
        star_coord = SkyCoord(ra=star['ra_dec']*u.deg, dec=star['dec_dec']*u.deg)
        general_closest_stars[key] = find_closest_star(star_coord, df_combined)

    # Print the results
    for key, value in general_closest_stars.items():
        print(f"Closest star to {key} ({carbon_stars[key]['name']}):")
        print(value)

    # Save final results to CSV
    output_filename = 'final_results.csv'
    results_df = pd.DataFrame(general_closest_stars).T
    results_df.to_csv(output_filename, index=False)

    # Download the CSV file
    files.download(output_filename)

    print(f"Final results saved as '{output_filename}'")

# Execution Protection: prevents accidental execution  
else:
    print("Execution blocked")
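
The `parse_skycoord` helper above depends on the exact text that `SkyCoord` produces when written to CSV, including its line break and indentation. A slightly more tolerant alternative is to extract the two numbers with a regular expression. The sketch below is a hedged illustration: `extract_ra_dec` is a hypothetical name, and it assumes the stored text contains a single parenthesised pair of decimal degrees, as in the format handled above.

```python
import re

# A signed decimal number, optionally with an exponent
_NUMBER = r'[-+]?\d+(?:\.\d*)?(?:[eE][-+]?\d+)?'
# A parenthesised pair of numbers, e.g. "(183.95, -64.34)"
_PAIR = re.compile(rf'\(\s*({_NUMBER})\s*,\s*({_NUMBER})\s*\)')


def extract_ra_dec(coord_text):
    """Return (ra_deg, dec_deg) parsed from a SkyCoord repr string, or None."""
    match = _PAIR.search(str(coord_text))
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


example = '<SkyCoord (ICRS): (ra, dec) in deg\n    (183.95670417, -64.34326944)>'
print(extract_ra_dec(example))
```

Writing the right ascension and declination as separate numeric columns when the per-catalog CSVs are saved would avoid this parsing step altogether.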

# 5. Final Results Table

| RA (h) | RA (m) | RA (s)  | Dec (°) | Dec (') | Dec (") | Mag    | Mag error | Type | Coordinates (RA, Dec; deg)    | Separation (arcsec) | Carbon Star |
|--------|--------|---------|---------|---------|---------|--------|-----------|------|-------------------------------|---------------------|-------------|
| 12     | 15     | 49.609  | -64     | 20      | 35.77     | 12.124 | 0.01  | -1   | (183.95670417, -64.34326944) | 0.2373               | s0           |
| 12     | 15     | 58.496  | -64     | 20      | 37.36     | 11.935 | 0.01  | -1   | (183.99373333, -64.34371111) | 0.3609               | s1           |
| 12     | 19     | 16.168  | -64     | 23      | 14.34     | 11.444 | 0.01  | -1   | (184.81736667, -64.38731667) | 0.3983               | s2           |
| 12     | 24     | 26.992  | -64     | 13      | 14.62     | 10.964 | 0.01  | -1   | (186.11246667, -64.22072778) | 0.3836               | s3           |
| 12     | 28     | 19.414  | -64     | 27      | 24.08     | 10.839 | 0.01  | -1   | (187.08089167, -64.45668889) | 0.1208               | s4           |
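
As a sanity check on the table, the separations can be recomputed from the sexagesimal columns and the reference coordinates in the `carbon_stars` dictionary using only the standard library. This is an independent sketch based on the haversine formula, not the `astropy` routine used in the notebook; `to_degrees` and `separation_arcsec` are hypothetical helper names.

```python
import math


def to_degrees(first, minutes, seconds, hours=False):
    """Convert sexagesimal text fields to decimal degrees (hours=True for RA)."""
    text = str(first).strip()
    sign = -1.0 if text.startswith('-') else 1.0
    value = abs(float(text)) + float(minutes) / 60.0 + float(seconds) / 3600.0
    return sign * value * (15.0 if hours else 1.0)


def separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcseconds (haversine); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2.0) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2.0) ** 2)
    return math.degrees(2.0 * math.asin(math.sqrt(h))) * 3600.0


# key: (reference RA, reference Dec, matched RA, matched Dec, tabulated separation)
pairs = {
    's0': (('12', '15', '49.6'), ('-64', '20', '36'), ('12', '15', '49.609'), ('-64', '20', '35.77'), 0.2373),
    's1': (('12', '15', '58.5'), ('-64', '20', '37'), ('12', '15', '58.496'), ('-64', '20', '37.36'), 0.3609),
    's2': (('12', '19', '16.2'), ('-64', '23', '14'), ('12', '19', '16.168'), ('-64', '23', '14.34'), 0.3983),
    's3': (('12', '24', '27.0'), ('-64', '13', '15'), ('12', '24', '26.992'), ('-64', '13', '14.62'), 0.3836),
    's4': (('12', '28', '19.4'), ('-64', '27', '24'), ('12', '28', '19.414'), ('-64', '27', '24.08'), 0.1208),
}

recomputed = {}
for key, (ref_ra, ref_dec, ra, dec, tabulated) in pairs.items():
    recomputed[key] = separation_arcsec(
        to_degrees(*ref_ra, hours=True), to_degrees(*ref_dec),
        to_degrees(*ra, hours=True), to_degrees(*dec),
    )
    print(f"{key}: recomputed {recomputed[key]:.4f} arcsec, tabulated {tabulated:.4f} arcsec")
```

If the two columns agree to within rounding, the matched positions and the reported separations are mutually consistent.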
