 # Chapter 1: Creating SAS Tables      
**Three Possible Modes of Creation**       

   1 - Importing existing data from common formats (e.g., Excel).  
   2 - Entering data directly in a SAS program.  
   3 - Reading external data in ASCII format.  

**I - Importing Data from Excel**

PROC IMPORT reads an Excel file and writes its contents to a SAS table.

In [None]:
proc import
    out=test
    datafile="c:\documents\test.xls"
    dbms=xls replace;
    sheet="Sheet1";
    getnames=yes;
    range="Sheet1$A1:C12";
    /* mixed=yes; is accepted only with dbms=excel or dbms=excelcs, not with dbms=xls */
run;

**Important Points**     

Each SAS statement must end with a semicolon (;).       
SAS accepts both uppercase and lowercase code.        
Before importing Excel data, ensure:       
  - Missing values are represented by empty cells.
  - Decimal values use the decimal separator of your Excel regional settings (a comma in French-locale installations).  
  - Variable names are on the first row.

**Explanation of Parameters**   
 - OUT specifies the name of the output SAS table.      
 - DBMS indicates the file format (e.g., xls for .xls workbooks in the Excel 97-2003 format, xlsx for .xlsx workbooks from Excel 2007 and later).  
 - REPLACE overwrites any existing SAS table with the same name.       
 - SHEET specifies the sheet name to import.      
 - GETNAMES=YES uses the first row as variable names.     
 - RANGE specifies the portion of the sheet to import.     
 - MIXED=YES imports a column that mixes numeric and character cells as a character variable. This statement is available with DBMS=EXCEL (or EXCELCS), not with DBMS=XLS.  
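
For a workbook saved in the newer .xlsx format, the same parameters can be reused with a different DBMS value. The sketch below is only illustrative: the file `test.xlsx` and the output table `work.test_xlsx` are hypothetical names, and DBMS=XLSX assumes that SAS/ACCESS Interface to PC Files is licensed on your installation.

```sas
/* Sketch: import one sheet of a hypothetical .xlsx workbook */
proc import
    out=work.test_xlsx                  /* hypothetical output table in WORK */
    datafile="c:\documents\test.xlsx"   /* hypothetical file */
    dbms=xlsx replace;
    sheet="Sheet1";
    getnames=yes;                       /* first row supplies the variable names */
run;
```

Checking the log after the step runs shows how many observations and variables were created.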

**Alternative: Using the Import Wizard**      
 1 - From the main menu, choose File -> Import Data....      
 2 - Follow the steps in the wizard:      
  - Click "Next" in the first window.     
  - In the "Connect to MS Excel" window, browse to select the Excel file.     
  - Select the sheet and click "OK."     
  - Specify the library (permanent or temporary) where the SAS table will be stored.     
  - Assign a name to the SAS table.     
  - Click "Finish."      

**Storing SAS Tables**    
 1 - Temporary Library (WORK):    
   - Contents are deleted when you exit SAS.    
 2 - Permanent Library:    
   - Define using the LIBNAME statement, e.g.,   

In [None]:
libname in "d:\mathieu\data\";

 - This assigns a logical name (in) to a physical directory.
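
To make the difference between the two kinds of library concrete, the following sketch copies a temporary table into the permanent library. It assumes that the libref `in` has been assigned as above and that `work.test` already exists (for instance from the PROC IMPORT step); `in.test_copy` is a hypothetical table name.

```sas
/* WORK.TEST is deleted when the session ends; IN.TEST_COPY stays on disk */
data in.test_copy;
    set work.test;
run;
```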

**Clearing Libraries**    
 - To clear a libref (the files in the directory are not deleted):

In [None]:
libname in clear;

- To clear all user-defined librefs:

In [None]:
libname _all_ clear;

**II - Direct Data Entry**

- For small datasets, directly entering data is quicker.

In [None]:
data in.test;
    length store $ 9;   /* list input defaults to 8 characters, which would truncate "Carrefour" */
    input store $ sales advertising;
    datalines;
    Auchan 164 34
    Carrefour 138 36
    Lidl 85 179
    Franprix 168 45
    Aldi 201 67
    ;
run;

*Notes*         

1 - In the DATA step, DATALINES (or CARDS) is used for manual data entry.        
2 - Variables must be defined:     
  - Use dollar sign for character variables.     
  - Numeric variables do not require special notation.       
3 - Always end the DATA step with a RUN statement.     
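4 - Place a LIST statement before DATALINES to write each input record to the SAS log for debugging.  
5 - Missing values are represented by a period (.).  
6 - Use the & modifier with $ to allow single embedded blanks in character values; consecutive fields must then be separated by at least two blanks.  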
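
Notes 4 to 6 can be combined in one small step. This is a sketch with made-up records: the table `work.notes_demo` and the value `Auchan Nord` are hypothetical and serve only to show an embedded blank.

```sas
data work.notes_demo;
    length store $ 20;                  /* room for names longer than 8 characters */
    input store $ & sales advertising;  /* & lets store contain single blanks */
    list;                               /* write each input record to the log */
    datalines;
Auchan Nord  164 34
Lidl  85 .
;
run;
```

The store name must be followed by at least two blanks, because a single blank no longer ends the field. The period in the second record gives `advertising` a missing value.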

**III - Reading External Data**

**Fixed-Column Format**

In [None]:
data in.test;
    infile "d:\mathieu\data\test.txt";
    input store $1-9 sales 10-15 advertising 16-22;
run;

- INFILE specifies the external file path.              
- INPUT gives the column range of each variable within a record; a $ before the range marks a character variable.

**Delimited Format**

In [None]:
data in.test;
    length store $ 9;   /* list input would otherwise cap store at 8 characters */
    infile "d:\mathieu\data\test.txt" dlm=';';
    input store $ sales advertising;
run;

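 - Use DLM to specify the delimiter, e.g., `dlm=';'`, `dlm=','`, or `dlm=' '`.  
 - Every record should contain the same fields in the same order; add the DSD option to INFILE if two consecutive delimiters mark a missing value.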
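
One common source of parsing errors is an empty field in a delimited record. A possible way to handle it is the DSD option of INFILE, sketched here with made-up comma-separated records read in-stream instead of from an external file (`work.dsd_demo` is a hypothetical table name).

```sas
data work.dsd_demo;
    length store $ 9;
    infile datalines dlm=',' dsd;   /* DSD: two consecutive delimiters mean a missing value */
    input store $ sales advertising;
    datalines;
Auchan,164,34
Lidl,,179
;
run;
```

Without DSD, consecutive delimiters are treated as a single one, so `179` would be read into `sales` instead of `advertising`. Commas are used here because a data line that contains a semicolon would end a plain DATALINES block.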

*Validation*

1 - Check the SAS log.

2 - Use PROC PRINT or PROC CONTENTS:

In [None]:
proc print data=in.test;
run;

proc contents data=in.test;
run;

**IV - Characteristics of SAS Tables and Variables**    
1 - SAS tables are stored as files with the .sas7bdat extension.            
2 - SAS programs use the .sas extension, while logs and outputs use .log and .lst, respectively.         
3 - SAS variables:             
 - Numeric Variables: Accept decimal and scientific notation (e.g., 5.31, 1E3).  
 - Character Variables: Maximum length is 32,767 characters.  
 - Missing numeric values are represented by a period (.); missing character values are represented by a blank (' ').  

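4 - Naming Rules:  
  - Table and variable names must be 1-32 characters long (librefs are limited to 8 characters).  
  - Names begin with a letter or underscore (_), followed by letters, digits, or underscores.  
  - Avoid special characters and reserved names (e.g., _ALL_, _NUMERIC_).  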
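
The naming rules and the two kinds of missing value can be tried out in a short step. All table and variable names below are hypothetical and chosen only for illustration.

```sas
data work.naming_demo;
    _store_id  = 1;      /* starts with an underscore: valid */
    sales_2024 = 5.31;   /* digits are allowed after the first character */
    big_value  = 1E3;    /* scientific notation for 1000 */
    no_sales   = .;      /* missing numeric value */
    no_name    = ' ';    /* missing character value */
run;

proc print data=work.naming_demo;
run;
```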