### GWU STAT 4197/ STAT 6197

#### Week 4, Part 1: SAS Functions (Code Examples)
#### <span style='color:blue'> SAS Functions and Variable Type Conversions  </span>


#### A SAS function is a routine that returns a value computed from its arguments; it can be called in a DATA step or through the %SYSFUNC macro function

* SAS Functions by Category
    * Character  [e.g., SUBSTR(), SCAN(), CATX()]
    * Date and Time [e.g., Today(), Date(), Year(), QTR(), Month(), Day(), Week(), Weekday(), Time(), Timepart()]
    * Truncation [e.g., ROUND(), CEIL(), FLOOR(), INT()]
    * Descriptive Statistics [e.g., Mean(), Median(), Max(), Min(), N(), NMISS(), CMISS()]
    * Special
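
As a minimal sketch of the two calling contexts named above (the variable names and values are illustrative and not taken from the course files), the same function can be called in a DATA step or wrapped in %SYSFUNC in open macro code:

```sas
/* Functions called inside a DATA step */
data _null_;
  x = 2.7;
  x_ceil  = ceil(x);        /* 3 */
  x_floor = floor(x);       /* 2 */
  x_int   = int(x);         /* 2 */
  avg     = mean(1, 2, 6);  /* 3 */
  putlog (_all_) (= /);
run;

/* The same kind of call through %SYSFUNC, outside any DATA step */
%let x_ceil = %sysfunc(ceil(2.7));
%put NOTE: x_ceil=&x_ceil;
```

In the macro version the result is text stored in a macro variable rather than a numeric DATA step variable.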



### SUBSTR Function

The SUBSTR function is best used when you know the exact position of the substring within the character value. You specify the following:

* variable name 
* starting position
* number of characters to extract


[Inserting a substring into a SAS string by Leonid Batkhan - A must-read article from SAS Blogs](https://blogs.sas.com/content/sgf/2021/02/15/inserting-a-substring-into-a-sas-string/?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+TheSasTrainingPost+%28The+SAS+Learning+Post+-%3E+SAS+Users%29)

In [2]:
*Ex1_substr.sas (Part 2);
options nosource nonotes nodate nonumber;
ods html close;
 data _Null_;
   var1 = 'Stat4197';
   new_var1 = SUBSTR(var1,5);
   new_var3= SUBSTR(var1,5,4);
 putlog (_ALL_) (=// +2);
 run;


### SUBSTR Function (Left Side)
* can also be used to replace characters in a character variable

In [60]:
*Ex1_substr.sas (Part 1);
options nocenter nodate nonumber nonotes nosource;
ods exclude all;
data _Null_;
   var1 = 'Geology';
   new_var1 = SUBSTR(var1,1,3);
   new_var2=var1;
   SUBSTR(new_var2,1,3)='Zoo';
putlog (_ALL_) (=// +2);
 run;


### SCAN Function
* used to extract words from a character string
    * when you know the order of the words
    * when their position varies
    * when the words are marked by some delimiter
* the blank and the comma are among the default delimiters

#### Code explanation for the last assignment statement in the DATA step below

* The use of the ‘Q’ modifier alone as the fourth argument causes the SCAN function to ignore the word delimiters within quoted strings. 

* The use of the ‘R’ modifier along with the ‘Q’ modifier enables you to correctly separate the words and removes the quotes from the two quoted words.  
(Carpenter, 2012)
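
To see what each modifier contributes, a small sketch (variable names are illustrative) can run the same SCAN call with no modifier, with Q alone, and with Q and R together, and compare the three results in the log:

```sas
data _null_;
  var3 = "'Silver Spring, MD 20906', 'Columbia, MD 19204'";
  no_modifier = scan(var3, 1, ',');        /* the comma inside the quotes acts as a delimiter */
  q_only      = scan(var3, 1, ',', 'q');   /* delimiters inside quoted substrings are ignored */
  q_and_r     = scan(var3, 1, ',', 'qr');  /* same, and one layer of quotes is removed */
  putlog (_all_) (= /);
run;
```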


In [2]:
*Ex2_scan.sas (Part 1);
options nocenter nodate nonumber nonotes nosource;
ods html close;
data _Null_;
   var1 = 'United States, Washington DC';
   var2 ='Tim Johnson';
   var3 = " 'Silver Spring, MD 20906', 'Columbia, MD 19204'" ;
   get_country=scan(var1,1,',');
   get_city=scan(var1,2,',');
   get_LastName= scan(var2,2);
   get_CityZip1_qr = scan(var3,1, ',', 'qr'); 
putlog (_ALL_) (=//+2);
run;

In [28]:
*Ex2_scan.sas (Part 2);
options nocenter nodate nonumber;
data in_a;
input @1 string $char65.;
dbegin=scan(string,2,'.');
datalines;
'C:\My Files\GWU\SAS\.09202018.read_data.txt'
;
proc print; run;

Obs,string,dbegin
1,'C:\My Files\GWU\SAS\.09202018.read_data.txt',09202018


### CATX Function
* concatenates character values, removes their leading and trailing blanks, and inserts a separator between them


In [31]:
*Ex3_CATX.sas (Part 1);
options nocenter nodate nonumber nonotes nosource;
ods exclude all;
data _null_;
  Length con: $12 v_catx $35; 
  con1 = 'Hypertension';
  con2 = 'Stroke';
  con3 = 'Diabetes';
  separator=','; 
   v_catx = CATX(separator, of con1-con3);
   putlog v_catx = ;
run;

### CATX and IFC Functions

In [29]:
*Ex4_CATX_IFC.sas;
options nocenter nodate nonumber;
data try ;
input (ID R SAS Python SPSS Stata) ($);
list = CatX(', ', 
               IfC( R = 'Yes' , 'R' , '' ),
               IfC( SAS = 'Yes', 'SAS' , '' ),
               IfC( Python = 'Yes' , 'Python' , '' ),
               IfC( SPSS = 'Yes' , 'SPSS' , ''),
               IfC( Stata = 'Yes' , 'Stata' , '' ));
datalines;
ID01 Yes Yes No No No
ID02 No Yes No No Yes
ID03 No Yes Yes No No
ID04 Yes Yes Yes No No
;
proc print noobs ; run;


ID,R,SAS,Python,SPSS,Stata,list
ID01,Yes,Yes,No,No,No,"R, SAS"
ID02,No,Yes,No,No,Yes,"SAS, Stata"
ID03,No,Yes,Yes,No,No,"SAS, Python"
ID04,Yes,Yes,Yes,No,No,"R, SAS, Python"


### COMPRESS and COMPBL Functions

* COMPRESS function removes unwanted characters from a string variable.
* COMPBL function replaces multiple blanks with single blanks.


In [30]:
*Ex6_compress_compbl.sas (Part 1);
data work.HAVE;
 input ICD_string $1-5 +1 label $40.;
   x_string=compress(icd_string, '.');
   x_label=compbl(label);
datalines;
S72.0 Fracture of  head and  neck  of   Femur
 ;
proc print data=work.HAVE noobs; 
run;

ICD_string,label,x_string,x_label
S72.0,Fracture of head and neck of Femur,S720,Fracture of head and neck of Femur


### COMPRESS Function with the kd modifier
* With two arguments, COMPRESS removes from the character string (first argument) every character listed in the second argument, here the dash.
* The kd modifier in the third argument reverses this behavior: SAS keeps the digits (and any characters listed in the second argument) and removes everything else from the character string.
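
As a hedged illustration of how the k (keep) modifier combines with a character class, the sketch below uses a made-up input string and leaves the second argument empty; `d` selects digits and `a` selects alphabetic characters:

```sas
data _null_;
  raw = 'Tel: (301) 538-0234';
  digits_only  = compress(raw, , 'kd');  /* keep digits only */
  letters_only = compress(raw, , 'ka');  /* keep alphabetic characters only */
  putlog digits_only= letters_only=;
run;
```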



In [12]:
*Ex6_compress_compbl.sas (Part 2);
data work.HAVE1; 
input ID $12.;
Var_remove_dash=COMPRESS(ID, "-");
Var_keep_Digit=COMPRESS(ID," ", "kd");  * kd means keep-digits;
datalines;
301-538-0234
;
proc print data=work.Have1; run;

proc contents data=work.Have1 p;
ods select position;
run;

Obs,ID,Var_remove_dash,Var_keep_Digit
1,301-538-0234,3015380234,3015380234

Variables in Creation Order,Variables in Creation Order,Variables in Creation Order,Variables in Creation Order
#,Variable,Type,Len
1,ID,Char,12
2,Var_remove_dash,Char,12
3,Var_keep_Digit,Char,12


### INDEX, FIND, and UPCASE Functions

In [8]:
* Ex7_index_find.sas;
* Contributed by Nat Wooding and data_null to SAS-L;

 Data Conditions;
   input @1 Condition $22.;
 datalines;
 heart failure 
 early heart failure
 failure of the heart
;
data Have;
  set Conditions;

 if index(UPCASE(Condition), "HEART FAILURE") gt 0 then HF1 = 1; 
       else HF1 =0;

 HF2 = ( INDEX(UPCASE(Condition), "HEART FAILURE") gt 0 ) ;

 HF3 = find(Condition,'heart failure','I') gt 0; 
run;
proc print data=Have noobs; run;

Condition,HF1,HF2,HF3
heart failure,1,1,1
early heart failure,1,1,1
failure of the heart,0,0,0


### TRANSLATE and TRANWRD Functions

* TRANSLATE handles character replacement for single-byte character sets only.

* "The TRANWRD function differs from TRANSLATE in that it scans for words (or patterns of characters) and replaces those words with a second word (or pattern of characters)." (SAS documentation)
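
The difference is easiest to see when both functions receive the same pair of strings. In this sketch (made-up text, illustrative variable names), TRANSLATE maps each character individually, while TRANWRD replaces only the exact pattern:

```sas
data _null_;
  txt = 'cab back';
  /* TRANSLATE(source, to, from): every a becomes x and every b becomes y */
  by_char = translate(txt, 'xy', 'ab');   /* cxy yxck */
  /* TRANWRD(source, target, replacement): only the sequence 'ab' is replaced */
  by_word = tranwrd(txt, 'ab', 'xy');     /* cxy back */
  putlog (_all_) (= /);
run;
```

Note the argument order: TRANSLATE takes the replacement characters before the characters to be replaced, whereas TRANWRD takes the target first.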


In [9]:
*Ex8_tranwrd_translate.sas;
options nocenter nodate nonumber nonotes nosource;
ods exclude all;
data _NULL_;
  date1='12/31/2010';
  date1_translate = translate(date1, '-', '/');
  txt = 'Data from surveys';
  txt_tranwrd = tranwrd(txt, 'surveys', 'records');
putlog (_ALL_) (= // +2);
run;


### Date and Time Functions


In [10]:
*Ex9_Date_Function.sas (Part 1);
options nocenter nodate nonumber nonotes nosource;
Data _Null_;
 date_time = '13Jan2016:23:15:30'dt;
date_part=datepart(date_time);
time_part=timepart(date_time);
month = month(date_part);
weekday=weekday(date_part);
day=day(date_part);
year=year(date_part);
new_date=mdy(1,27,2016);
Today = date();
Today_x = today();
Quiz1_date ='23Sep2016'd;
System_date="&sysdate"d;
System_date_x="&sysdate9"d;
format date_time datetime20. 
       date_part new_date Today  Today_x 
       Quiz1_date System_date
       System_date_x date9. 
       time_part time.;
putlog (_ALL_)  (=/ +2);
run;

In [1]:
*Ex9_Date_Function.sas (Part 2);
data work.have;
input @1 date_time1 anydtdtm21.
      @1 date1 anydtdte15.
      @11 time1 anydttme13.;

      * Create new variables;
      date_time2 = date_time1; 
      time2=timepart(date_time1);  

format date_time1 datetime18.
       date1 date9.
       time1  time5. 
       time2 timeampm11.
       date_time2 dateampm22.2;
datalines;
01Jul2016:15:30:55
;
proc print data=work.have noobs;
var date1 date_time1 date_time2 time1 time2;
run;


date1,date_time1,date_time2,time1,time2
01JUL2016,01JUL16:15:30:55,01JUL16:03:30:55.00 PM,15:30,3:30:55 PM


### INTCK Function
* returns the number of intervals between two dates
* takes as its first argument the unit of the interval (e.g., years, months, weeks, days)

The INTCK function returns the integer count of the number of interval boundaries between two dates, two times, or two datetime values.
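
Because the default method counts boundaries rather than elapsed time, two dates one day apart can be "one year" apart. A short sketch (dates chosen for illustration) contrasts the default with the continuous method, requested through the optional fourth argument:

```sas
data _null_;
  /* One January 1 boundary lies between the two dates */
  boundary_count = intck('year', '31dec2009'd, '01jan2010'd);       /* 1 */
  /* The continuous method counts only complete years elapsed */
  continuous     = intck('year', '31dec2009'd, '01jan2010'd, 'c');  /* 0 */
  putlog boundary_count= continuous=;
run;
```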

In [2]:
*Ex10_INTCK_Function.sas;
options nocenter nodate nonumber;
data count_interval;
  years=intck('year','01jan2009'd,'01jan2010'd);
  SEMIYEAR=intck('SEMIYEAR','01jan2009'd,'01jan2010'd);
  quarters=intck('qtr','01jan2009'd,'01jan2010'd);
  months=intck('month','01jan2009'd,'01jan2010'd);
  weeks=intck('week','01jan2009'd,'01jan2010'd);
  days=intck('day','01jan2009'd,'01jan2010'd);
run;
proc print data=count_interval noobs; 
run;


years,SEMIYEAR,quarters,months,weeks,days
1,2,4,12,52,365


### INTNX Function
* returns a date that is some number of intervals away
* takes as its first argument the unit of the interval (e.g., years, months, weeks, days)

Use this function to move a date forward or backward by a number of intervals (say, months). The optional fourth argument sets the alignment of the returned date within the interval:
* Beginning (b): start of the interval (the default)
* Middle (m): center of the interval
* End (e): end of the interval
* Same (s): same relative position as in the starting interval


In [2]:
*Ex17_Advance_Dates_INTNX.sas;
options nocenter nonumber nodate;
data work.Have;
some_date='31Aug2018'D;
next_default_fw=intnx('month',some_date,2);
next_default_bw=intnx('month',some_date, -2);
next_b=intnx('month',some_date,2, 'beginning');
next_m=intnx('month',some_date,2, 'middle');
next_e=intnx('month',some_date,2, 'end');
next_s=intnx('month',some_date,2, 'same');
format some_date next: date9.;
run;
title1' Advancing Dates';
proc print data=work.Have noobs; run;
title1;


some_date,next_default_fw,next_default_bw,next_b,next_m,next_e,next_s
31AUG2018,01OCT2018,01JUN2018,01OCT2018,16OCT2018,31OCT2018,31OCT2018


### YRDIF Function
* returns the difference in years between two dates, computed according to the day-count basis ('AGE', '30/360', 'ACT/ACT', 'ACT/360', or 'ACT/365') that is specified in the third argument.

In [4]:
*Ex14_YRDIFF.sas;
options nocenter nonumber nodate nosource;
ods exclude all;
 data _null_;
 SDATE= '10jan2013'd;
 EDATE='10jul2017'd;
 N_DAYS=EDATE-SDATE;
 Y_age=yrdif(SDATE, EDATE, 'AGE');
 Y30360=yrdif(SDATE, EDATE, '30/360');  
 YACTACT=yrdif(SDATE, EDATE, 'ACT/ACT'); 
 YACT360=yrdif(SDATE, EDATE, 'ACT/360'); 
 YACT365=yrdif(SDATE, EDATE, 'ACT/365'); 
  format SDATE EDATE date9.
        N_DAYS Y30360 YACTACT YACT360 YACT365
         best.;
putlog (_ALL_) (= / +2);
run;


### PUT and INPUT Functions

[Understand the PUT and INPUT functions in SAS by Rick Wicklin](https://blogs.sas.com/content/iml/2024/10/14/put-and-input-functions.html)

* An informat reads a text string and converts it to a data value that is easier 
  to work with or analyze.

* A format converts a data value to a textual representation that is (hopefully!)
  easier to read and interpret.

[Converting variable types—use PUT() or INPUT()? by Sunil Gupta](https://blogs.sas.com/content/sgf/2015/05/01/converting-variable-types-do-i-use-put-or-input/)

"The answer to the question "Do I use PUT() or INPUT()?" depends on what your target variable type is and what your source variable type and data are. Below are three questions to consider:

* Is your target variable character or numeric?

* Is your source variable character or numeric?

* If your source variable is character, is your data value character or numeric?

Based on your answers to the three questions above, you can identify whether PUT() or INPUT() comes first. Keep these four rules in mind when writing your SAS statements:

* PUT() always creates character variables

* INPUT() can create character or numeric variables based on the informat

* The source format must match the source variable type in PUT()

* The source variable type for INPUT() must always be character variables"
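
The rules quoted above can be sketched in a few lines. The variable names and values here are illustrative only:

```sas
data _null_;
  /* Character source, numeric target: INPUT with a numeric informat */
  char_num = '1234';
  num_val  = input(char_num, 8.);

  /* Numeric source, character target: PUT with a numeric format */
  num_id  = 42;
  char_id = put(num_id, z5.);            /* 00042 */

  /* Character source holding a date, numeric SAS date target */
  char_date = '15MAR2021';
  sas_date  = input(char_date, date9.);

  putlog num_val= char_id= sas_date= date9.;
run;
```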

In [8]:
*Ex11A_put_function.sas;
options nocenter nonumber nonotes nosource; 
proc format;
 value $stypeF  U='Undergraduate'  G='Graduate' ;
 value stypeF  1='Undergraduate'  2='Graduate' ;
run;  
data put_function_data;
    c_stype ='U';  
    f_stype = put(c_stype, $stypeF.);
    
    n_stype=2;
    f_stype_n = put(n_stype, stypeF.);
    
    n_id = 12345678; 
    c_id =put(n_id, 8.);
    
    n_amount = 23500; 
    c_amount = put(n_amount, dollar7.);
    
    SAS_date_value = 1357;    
    c_date = put(SAS_date_value, Weekdate.);
    
putlog (_ALL_) (=/ +2);
run;



The SAS System                                                                                        12:56 Friday, February 7, 2025


  c_stype=U
  f_stype=Undergraduate
  n_stype=2
  f_stype_n=Graduate
  n_id=12345678
  c_id=12345678
  n_amount=23500
  c_amount=$23,500
  SAS_date_value=1357
  c_date=Thursday, September 19, 1963



In [4]:
*Ex11C_put_input_function.sas (Part 1);
options nocenter nodate nonumber;
 data have1;
  do i=1 to 5;
    Num_in_words=put(i,words12.);
    output;
  end;
 run;
 proc print data=have1 noobs; run;


i,Num_in_words
1,one
2,two
3,three
4,four
5,five


In [1]:
*Example_put_align_values.sas;
options nocenter nodate nonumber;
data have;
  prevalence = 23.05; SE=1.9845; output;
  prevalence = 3.05; SE=0.1845; output;
run; 
data want;
set have;
  cattdvar= catx(' ', put(prevalence, 5.1), 
                  cats( '(',put(SE, 4.2),')' )
                 );
  xcattdvar= put(cattdvar,$11. -r);
run;
proc print data=want noobs; run;



prevalence,SE,cattdvar,xcattdvar
23.05,1.9845,23.1 (1.98),23.1 (1.98)
3.05,0.1845,3.1 (0.18),3.1 (0.18)


#### In the code below, we do the following:
* Convert a character variable into a numeric variable (INPUT function)
* Convert the same character variable into a numeric value (INPUT function) and then into a new, dollar-formatted character variable (PUT function)
* Display the values of these variables using PROC PRINT

Note that the F5.2 informat reads only the first five characters of each value, so '1112.80' is read as 1112 and '112.81' as 112.8.

In [3]:
*Ex11C_put_input_function.sas (Part 2);
 data have2;
  do chars='1112.80', '112.81', '2.83', '0.84';
    Char_to_num = input(chars,F5.2);
    Char_to_num_char=put(input(chars,F5.2), dollar9.2);
    output;
  end;
 run;
 proc print data=have2 noobs; run;



chars,Char_to_num,Char_to_num_char
1112.8,1112.0,"$1,112.00"
112.81,112.8,$112.80
2.83,2.83,$2.83
0.84,0.84,$0.84
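
The first two rows of the output above lose digits because the informat width is narrower than the text being read. A minimal sketch of the width effect, with illustrative variable names and 7.2 as one possible wider informat:

```sas
data _null_;
  chars = '1112.80';
  too_narrow  = input(chars, 5.2);  /* reads only the first five characters, '1112.' */
  wide_enough = input(chars, 7.2);  /* reads all seven characters */
  putlog too_narrow= wide_enough=;
run;
```

When the text already contains a decimal point, the d part of the informat is ignored, so only the width matters here.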


### How to convert a numeric variable holding calendar dates (as mmddyy digits) into a character string (PUT function) and then into a new numeric variable containing SAS dates (INPUT function)

* Apply a date format to the new numeric variable in the DATA step
* Display the values of this numeric (SAS date) variable using PROC PRINT

In [4]:
*Ex11C_put_input_function.sas (Part 3);
 data have3;
  do mon_day_yr =11318, 121018, 122418;
      mon_day_yr_SAS=input(put(mon_day_yr,6.),mmddyy6.);
format mon_day_yr_SAS date9.;
    output;
  end;
 run;
 proc print data=have3 noobs; run;


mon_day_yr,mon_day_yr_SAS
11318,13JAN2018
121018,10DEC2018
122418,24DEC2018


In [22]:
*Ex35_Arrays_to_assign_values (Part 2);
*Contributed by Rick Wicklin to SAS-L - 1/5/2016;
options nocenter nodate nonumber nosource nonotes;
data _null_;
array x[3] (1, 2, 3);
x_sum=put( sum(of x[*]), Z6.);
x_avg=put( mean(of x[*]), 5.1);
x_std=put( std(of x[*]), 5.3);
putlog _ALL_;
run;

### Working with SAS Date Values

#### In the code below, we do the following:

* Create a numeric variable that contains SAS date values
* Apply a date format to the same numeric variable in the DATA step
* Display the values of that numeric variable using PROC PRINT
* Examine the attributes of the variable using PROC CONTENTS


In [7]:
*Ex11C_put_input_function.sas (Part 4);
data have4;
  do stored_sas_date_value =11318, 121018, 122418;
      format stored_sas_date_value date9.;
    output;
  end;
 run;
 proc print data=have4 noobs; run;
 proc contents data=have4 varnum; 
 ods select position;
run;

stored_sas_date_value
27DEC1990
03MAY2291
03MAR2295

Variables in Creation Order,Variables in Creation Order,Variables in Creation Order,Variables in Creation Order,Variables in Creation Order
#,Variable,Type,Len,Format
1,stored_sas_date_value,Num,8,DATE9.


In [1]:
*Ex13_length_lengthc.sas;
options nocenter nonumber nodate;
data HAVE;
blank= ' ';
lengthc_of_dot=LENGTHC(dot);
length_of_dot=LENGTH(dot);

dot=.;
lengthc_of_blank=LENGTHC(blank);
length_of_blank=LENGTH(blank);
run;
title1 'Ex13_length_lengthc.sas';
proc print data=have; run;

proc contents data=have varnum;
ods select position;
run;
title1;




Obs,blank,lengthc_of_dot,dot,length_of_dot,lengthc_of_blank,length_of_blank
1,,12,.,12,1,1

Variables in Creation Order,Variables in Creation Order,Variables in Creation Order,Variables in Creation Order
#,Variable,Type,Len
1,blank,Char,1
2,lengthc_of_dot,Num,8
3,dot,Num,8
4,length_of_dot,Num,8
5,lengthc_of_blank,Num,8
6,length_of_blank,Num,8


### COALESCEC() and COALESCE() functions

See Messineo, Martha. 2017. Practical and Efficient SAS(R) Programming: The Insider's Guide. Cary, NC: SAS Institute Inc.

Page 11: "If you are coalescing character values in a DATA step, you must use the 
COALESCEC() function; the COALESCE() function is only for numeric values.
However, in PROC SQL, you can use COALESCE() for either numeric or character values." Messineo (2017).
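
For the DATA step side of that statement, a minimal sketch (made-up variable names and values) uses COALESCEC for character values and COALESCE for numeric values; each returns the first nonmissing argument:

```sas
data _null_;
  length nickname given_name display_name $20;
  nickname   = ' ';
  given_name = 'Alice';
  /* Character values: first nonblank argument */
  display_name = coalescec(nickname, given_name, 'Unknown');  /* Alice */

  score1 = .;
  score2 = 15;
  /* Numeric values: first nonmissing argument */
  first_score = coalesce(score1, score2, 0);                  /* 15 */

  putlog display_name= first_score=;
run;
```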


In [8]:
*Ex15_COALESCEC_Function.sas (Part 2);

/*Create a data set for COALESCE() with PROC SQL below.*/

data Have;
input score1 score2;
datalines;
 . 15  
 .  . 
 17 13
 14  . 
 .  20
 14 19
;
run; 

proc sql; 
title1 'Coalesce() replaces column values';
select Monotonic() as obs, 
      score1, 
      coalesce(score1, 0) as _score1,
      case when score1=. then 0 else score1 end as score1x
      from Have;
title 'Coalesce() combines column values';
 select	  Monotonic() as obs,  score1, score2,
        coalesce(score1, score2) as combined_score
          from Have;
quit;
title1;

obs,score1,_score1,score1x
1,.,0,0
2,.,0,0
3,17,17,17
4,14,14,14
5,.,0,0
6,14,14,14

obs,score1,score2,combined_score
1,.,15,15
2,.,.,.
3,17,13,17
4,14,.,14
5,.,20,20
6,14,19,14


In [3]:
*Ex18_n_nmiss_cmiss.sas;
options nodate nonumber;
data work.HAVE;
infile datalines TRUNCOVER;
input A 1 B 3 C 5 D $ 7-11 E $ 13-16;
datalines;
7 1 1 SAS   Stat
8     SPSS  Econ
6 3 1 R     Math
  6 1       Soc 
  1   Stata Epi 
5 4 2       Stat
;
 
data work.WANT;
set work.HAVE;
Is_Missing_A= missing(A);
Is_Missing_D= missing(D);
Count_Missing_Num = nmiss(OF A--C);
Count_Missing_Both_Num_Char = cmiss(A,B,C,D,E);
Count_Nonmissing_Num = n(OF A--C);
run;
title1 'Handling missing values';
proc print data=WANT noobs split='*';
label Is_Missing_A ='Whether*Variable A *Missing'
      Is_Missing_D ='Whether*Variable D*Missing'
Count_Missing_Num='Missing*Values*for*Numeric*Variables'
Count_Missing_Both_Num_Char='Missing*Values*for*Num/Char*Variables'
Count_Nonmissing_Num= 'Nonmissing*Values*for*Numeric*Variables'
; 
run;
title1;


A,B,C,D,E,Whether Variable A Missing,Whether Variable D Missing,Missing Values for Numeric Variables,Missing Values for Num/Char Variables,Nonmissing Values for Numeric Variables
7,1,1,SAS,Stat,0,0,0,0,3
8,.,.,SPSS,Econ,0,0,2,2,1
6,3,1,R,Math,0,0,0,0,3
.,6,1,,Soc,1,1,1,2,2
.,1,.,Stata,Epi,1,0,2,2,1
5,4,2,,Stat,0,1,0,1,3


### CALL MISSING Routine
[Five Simple Ways to Know If Variables in a Table Are All Missing - Xia Ke Shan, and Kurt Bremser SAS Global Forum 2020.](https://www.sas.com/content/dam/SAS/support/en/sas-global-forum-proceedings/2020/4737-2020.pdf)

In [49]:
*Ex26_delete_rows_all_vars_missing (Part 1);
data have1;
  set sashelp.class(obs=7);
  if mod(_n_,3)=0 then call missing(of _all_);
run;
title1 'Data Set Have1'; 
proc print data=have1; run;


Obs,Name,Sex,Age,Height,Weight
1,Alfred,M,14,69.0,112.5
2,Alice,F,13,56.5,84.0
3,,,.,.,.
4,Carol,F,14,62.8,102.5
5,Henry,M,14,63.5,102.5
6,,,.,.,.
7,Jane,F,12,59.8,84.5

In [None]:
*Ex26_delete_rows_all_vars_missing (Part 2);
/* Deletes every row with at least one missing value. In have1 (created in Part 1) */
/* these are exactly the rows in which all variables are missing.                  */
data have2;
set have1;
  if cmiss(of _all_) >0 then delete;
run;
title1 'Data Set Have2'; 
proc print data=have2; run;


### The Usage of COMPRESS and CATS Functions - An Application

In [53]:
*Ex26_delete_rows_all_vars_missing (Part 3);
*Posted by Quentin McMullen to SAS-L 9/25/2018;
data have3;
set have1;
if compress(cats(of _all_),'.')=' ' then delete;
run;
title1 'Data Set Have3'; 
proc print data=have3; run;
title1;

Obs,Name,Sex,Age,Height,Weight
1,Alfred,M,14,69.0,112.5
2,Alice,F,13,56.5,84.0
3,Carol,F,14,62.8,102.5
4,Henry,M,14,63.5,102.5
5,Jane,F,12,59.8,84.5



Acknowledgements:

PharmaSUG 2013 - Paper CC30
Useful Tips for Handling and Creating Special Characters in SAS®
Bob Hull, SynteractHCR, Inc., Carlsbad, CA
Robert Howard, Veridical Solutions, Del Mar, CA 


*Ex31_byte_function.sas; /*Code in Markdown to prevent execution*/

options nocenter notes nodate nonumber source;

  data _null_;
  
 do i=1 to 255;
 
  byte=byte(i);
  
   put i +10 byte;
   
 end;
 
 run; 


### How to Convert a Character Variable Containing a Week Date into a SAS Date
### The Usage of INDEX, SUBSTR, and INPUT Functions 
Author: Kurt Bremser

According to the author, you first need to remove the redundant weekday name with SUBSTR(); the ANYDTDTE informat will then recognize the date (SAS Support Communities, 04-23-2018).


In [19]:
*Ex32_Week_date_problem.sas;

options nocenter nodate nonumber;
data test;
input datestr $30.;
i = index(datestr,' ');
date = input(substr(datestr,i),anydtdte30.);
*drop i;
format date weekdate30.;
cards;
Saturday March 31, 2018
;
run;
proc print data=test noobs; run;

datestr,i,date
"Saturday March 31, 2018",9,"Saturday, March 31, 2018"


In [1]:
*Ex38_SCAN_List_Files_Data_Step;
ods html close;
Filename filelist pipe "dir /b /s c:\SASCourse\Week4\*.sas";  
   Data Listfiles;         
    Length very_last_word $8;
     Infile filelist truncover;
     Input filename $100.;
     very_last_word=scan(filename, -1);
     
     * Last words from the reverse direction delimited by a slash;
     File_name=substr(scan(filename, -1, '\'),1);
     Run; 
proc sort data=Listfiles; by File_name; run;
proc print data=Listfiles;
var  File_name;
where very_last_word eq 'sas';
run;



Obs,File_name
1,Ex10_INTCK_Function.sas
2,Ex11A_put_function.sas
3,Ex11B_put_align_values.sas
4,Ex11C_put_input_func.sas
5,Ex13_length_lengthc.sas
6,Ex14_YRDIFF.sas
7,Ex15_COALESCEC_Function.sas
8,Ex16_Array.sas
9,Ex17_Advance_Dates_INTNX.sas
10,Ex18__n__nmiss_cmiss.sas


The following code example has been obtained from https://communities.sas.com/t5/SAS-Programming/how-to-convert-char-var-to-sas-date/td-p/45067.

### Convert a character value to a number
To convert a character value to a number, you use the INPUT function with a specified informat, which indicates how you want SAS to read the number.

The w value (width) in w.d must be large enough to include the character length of the largest value to read (including decimal separator). (The d value is optional.) The w.d informat is flexible enough to interpret decimal values as well as scientific notation.

In [20]:
options nocenter nodate nonumber nosource;
ods html close;

data _null_;
  char1 = '12345678';
  char2 = '123.456';
  char3 = '123e-4';
  num1 = input(char1, 8.);
  num2 = input(char2, 8.);
  num3 = input(char3, 8.);
  put char1= char2= char3=;
  put num1= num2= num3=;
run;


char1=12345678 char2=123.456 char3=123e-4
num1=12345678 num2=123.456 num3=0.0123


In [19]:
options nocenter nodate nonumber nosource;
ods html close;
data _null_;
  char1 = '1,234,567';
  num1 = input(char1, comma9.);
  /* read again, but this time apply a COMMA9. format for display */
  num1_fmt = input(char1, comma9.);
  format num1_fmt comma9.;
  put num1= num1_fmt=;
run;


num1=1234567 num1_fmt=1,234,567


In [5]:
options nocenter nonumber nodate;
ods html close;
data _null_;
input numvar 1-2;
charvar = strip(put(numvar, 8.));
infile datalines firstobs=2;
put numvar= charvar=;
datalines;
123456
 7
69
34
12
;
run;



numvar=7 charvar=7
numvar=69 charvar=69
numvar=34 charvar=34
numvar=12 charvar=12


### Data Type Conversion in SAS

The following code example has been obtained from https://communities.sas.com/t5/SAS-Programming/how-to-convert-char-var-to-sas-date/td-p/45067.

The COMMAw.d informat is more versatile than its name implies. It removes not only embedded commas but also blank spaces, dollar signs, percent signs, hyphens, and close parentheses from the input data. It also converts an open parenthesis at the beginning of a field to a minus sign, so that the value is read as negative.

In this example, the COMMA12. informat is used to convert several different styles of number expressions to SAS numeric values.

In [8]:
data raw;
 length raw_val $ 12;
 infile datalines;
 input raw_val;
datalines;
123456
1234.56
1,234.56
$1,234.56
(1,234.56)
-1234.56
;
run;

data convert;
 set raw;
 num = input(raw_val,comma12.);
run;
proc print data=convert;
run;

Obs,raw_val,num
1,123456,123456.0
2,1234.56,1234.56
3,"1,234.56",1234.56
4,"$1,234.56",1234.56
5,"(1,234.56)",-1234.56
6,-1234.56,-1234.56




https://communities.sas.com/t5/SAS-Communities-Library/How-to-convert-a-character-value-to-numeric-in-SAS/ta-p/847645

### Convert a character value to a SAS date or datetime

A SAS date is a numeric value that is valid for use with date functions and other mathematical operations. A SAS date might be formatted so that it contains characters in its display, but a SAS date is always stored as a number. Internally, a SAS date is the number of days since January 1, 1960. Similarly, a SAS datetime is a number: the number of seconds since midnight on January 1, 1960.
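
A short sketch (illustrative variable names) makes the internal representation visible by writing the same two values first as raw numbers and then with a format applied:

```sas
data _null_;
  d  = '02jan1960'd;            /* one day after the origin */
  dt = '01jan1960:00:01:00'dt;  /* sixty seconds after the origin */
  putlog d= dt=;                            /* d=1 dt=60 */
  putlog d= date9. dt= datetime20.;         /* same values, formatted for display */
run;
```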

 

To convert a character value to the date value it represents, use the INPUT function with one of the many date informats. This example shows two common date formats: ddMONyyyy (or DATE9.) and MM-DD-YYYY (or MMDDYY10.):

In [None]:
data dates;
 startdate = "12JUL2021";
 enddate = "07-30-2022";
 date_start = input(startdate,date9.);
 date_end = input(enddate,mmddyy10.);
 days_diff = date_end-date_start;
 format date_start date9. date_end date9.;
run;

https://communities.sas.com/t5/SAS-Communities-Library/How-to-convert-a-character-value-to-numeric-in-SAS/ta-p/847645

Tip: use the ANYDTDTE. informat to interpret a variety of date representations. See One informat to rule them all: Read any date into SAS. This program produces the same result as above:

In [9]:
data dates;
 startdate = "12JUL2021";
 enddate = "07-30-2022";
 date_start = input(startdate,anydtdte12.);
 date_end = input(enddate,anydtdte12.);
 days_diff = date_end-date_start;
 format date_start date9. date_end date9.;
run;
proc print data=dates; run;

Obs,startdate,enddate,date_start,date_end,days_diff
1,12JUL2021,07-30-2022,12JUL2021,30JUL2022,383


https://communities.sas.com/t5/SAS-Communities-Library/How-to-convert-a-character-value-to-numeric-in-SAS/ta-p/847645

In [10]:
data dt;
 raw='2022-11-23T17:32:35Z';
 val = input(raw,anydtdtm20.);
 format val datetime20.;
run;
proc print data=dt; run;

Obs,raw,val
1,2022-11-23T17:32:35Z,23NOV2022:17:32:35


https://communities.sas.com/t5/SAS-Communities-Library/How-to-convert-a-character-value-to-numeric-in-SAS/ta-p/847645

### Convert a numeric variable to a character variable
To convert a numeric variable to a character variable, use the PUT function with the desired format.

Note that the length of the new variable must be large enough to store the new value. This example stores the current datetime value and displays it in a new character variable.


In [11]:
data dt;
 length dt 8 dt_char $ 20;
 dt = datetime();
 dt_char = put(dt, datetime20.);
run;
proc print data=dt; run;

Obs,dt,dt_char
1,1991155477,04FEB2023:18:44:37



### Preserving leading zeros

https://communities.sas.com/t5/SAS-Communities-Library/How-to-convert-a-character-value-to-numeric-in-SAS/ta-p/847645

If your data values require leading zeros as a significant component of the display, use the Zw.d format to ensure the leading zeros are included. A common use case is postal (ZIP) codes in the US:

In [12]:
data raw;
 length city $ 20 zip_num 8;
 infile datalines;
 input city zip_num;
datalines;
Williamsville 14221
Raleigh 27613
Boston 02134
;
run;

data better;
 set raw;
 /* always 5-digits inc any leading zeroes */
 zip_char = put(zip_num,z5.);
run;
proc print data=better; run;

Obs,city,zip_num,zip_char
1,Williamsville,14221,14221
2,Raleigh,27613,27613
3,Boston,2134,02134


[Week 4 - Code examples: Difference between strip, compress and trim](https://communities.sas.com/t5/General-SAS-Programming/Difference-between-strip-compress-and-trim/td-p/323286)

(I have added the compbl function to the following code snippets.)

See also the doc:

STRIP function - removes all leading and trailing blanks

TRIM function - removes all trailing blanks

COMPRESS function - removes all blanks (by default - specify options to remove other chars)

COMPBL function - replaces each run of multiple blanks with a single blank character



In [18]:
options nocenter nodate nonumber nosource;
ods html close;

data _null_;
    length text $15;
    format text $char15.;
    text = '  ab   cde  f   ';
    trim = '*'||trim(text)||'*';
    compress = '*'||compress(text)||'*';
    strip = '*'||strip(text)||'*';
    compbl = '*'||compbl(text)||'*';
    put text=;
    put trim= ;
    put compress= ;
    put strip=;
    put compbl=;
run;


text=ab   cde  f
trim=*  ab   cde  f*
compress=*abcdef*
strip=*ab   cde  f*
compbl=* ab cde f *
